In mechanism design it is typical to impose incentive compatibility and then derive
an optimal mechanism subject to this constraint. By replacing the incentive compatibility
requirement with the goal of minimizing expected ex post regret, we are able
to adapt statistical machine learning techniques to the design of payment rules. This
computational approach to mechanism design is applicable to domains with multi-dimensional
types and situations where computational efficiency is a concern. Specifically,
given an outcome rule and access to a type distribution, we train a support vector
machine with a special discriminant function structure such that it implicitly establishes
a payment rule with desirable incentive properties. We discuss applications to
a multi-minded combinatorial auction with a greedy winner-determination algorithm
and to an assignment problem with egalitarian outcome rule. Experimental results
demonstrate both that the construction produces payment rules with low ex post regret,
and that penalizing classification errors is effective in preventing failures of ex
post individual rationality.


1 Introduction 


Mechanism design studies situations where a set of agents each hold private information 
about their preferences over different outcomes. The designer chooses a center that receives 
claims about such preferences, selects and enforces an outcome, and optionally collects 
payments. The classical approach is to impose incentive compatibility, ensuring that agents 
truthfully report their preferences in strategic equilibrium. Subject to this constraint, the 
goal is to identify a mechanism, i.e., a way of choosing an outcome and payments based 
on agents' reports, that optimizes a given design objective like social welfare, revenue, or 
some notion of fairness. 

There are, however, significant challenges associated with this classical approach. First
of all, it can be analytically cumbersome to derive optimal mechanisms for domains that 
are "multi-dimensional" in the sense that each agent's private information is described 
through more than a single number, and few results are known in this case.¹ Second,
incentive compatibility can be costly, in that adopting it as a hard constraint can preclude 
mechanisms with useful economic properties. For example, imposing the strongest form of 
incentive compatibility, truthfulness in a dominant strategy equilibrium or strategyproof- 
ness, necessarily leads to poor revenue, vulnerability to collusion, and vulnerability to 
false-name bidding in combinatorial auctions where valuations exhibit complementarities 
among items [2, 21]. A third difficulty occurs when the optimal mechanism has an outcome
or payment rule that is computationally intractable. 

In the face of these difficulties, we adopt statistical machine learning to automatically 
infer mechanisms with good incentive properties. Rather than imposing incentive compat- 
ibility as a hard constraint, we start from a given outcome rule and use machine learning 
techniques to identify a payment rule that minimizes agents' expected ex post regret relative 
to this outcome rule. Here, the ex post regret an agent has for truthful reporting in a given 
instance is the amount by which its utility could be increased through a misreport. While 
a mechanism with zero ex post regret for all inputs is obviously strategyproof, we are not 
aware of any additional direct implication in terms of equilibrium properties.² Support for
expected ex post regret as a quantifiable target for mechanism design rather comes from 
a simple model of manipulation where agents face a certain cost for strategic behavior. If 
this cost is higher than the expected gain, agents can be assumed to behave truthfully. 
We do insist on mechanisms in which the price to an agent, conditioned on an outcome, is 
independent of its report. This provides additional robustness against manipulation in the 
sense that there is no local price sensitivity.³

Our approach is applicable to domains that are multi-dimensional or for which the com- 
putational efficiency of outcome rules is a concern. Given the implied relaxation of incentive 

1 One example of a multi-dimensional domain is a combinatorial auction, where an agent's preferences
are described by a numerical value for each of several different bundles of items. 

2 The expected ex post regret given a distribution over types provides an upper bound on the expected 
regret of an agent who knows its own type but has only distributional information on the types of other 
agents. The latter metric is also appealing, but does not seem to fit well with the generalization error of 
statistical machine learning. An emerging literature is developing various regret-based metrics for quantifying
the incentive properties of mechanisms [19, 7, 17, 5], and there also exists experimental support
for a quantifiable measure of the divergence between the distribution on payoffs in a mechanism and that 
in a strategyproof reference mechanism like the VCG mechanism [18] . An earlier literature had looked 
for approximate incentive compatibility or incentive compatibility in the large-market limit, see, e.g., the 
recent survey by Carroll [5]. Related to the general theme of relaxing incentive compatibility is work of 
Pathak and Sönmez [20] that provides a qualitative ranking of different mechanisms in terms of the number
of manipulable instances, and work of Budish [3] that introduces an asymptotic, binary, design criterion 
regarding incentive properties in a large replica economy limit. Whereas the present work is constructive, 
the latter seek to explain which mechanisms are adopted in practice. 

3 Erdil and Klemperer [8] consider a metric that emphasizes this property. 






compatibility, the intended application is to domains in which incentive compatibility is 
unavailable or undesirable for outcome rules that meet certain economic and computational 
desiderata. The payment rule is learned on the basis of a given outcome rule, and as such 
the framework is most meaningful in domains where revenue considerations are secondary 
to outcome considerations. 

The essential insight is that the payment rule of a strategyproof mechanism can be 
thought of as a classifier for predicting the outcome: the payment rule implies a price to 
each agent for each outcome, and the selected outcome must be one that simultaneously 
maximizes reported value minus price for every agent. By limiting classifiers to discriminant 
functions⁴ with this "value-minus-price" structure, where the price can be an arbitrary
function of the outcome and the reports of other agents, we obtain a remarkably direct 
connection between multi-class classification and mechanism design. For an appropriate 
loss function, the discriminant function of a classifier that minimizes generalization error 
over a hypothesis class has a corresponding payment rule that minimizes expected ex post 
regret among all payment rules corresponding to classifiers in this class. Conveniently, 
an appropriate method exists for multi-class classification with large outcome spaces that 
supports the specific structure of the discriminant function, namely the method of structural 
support vector machines [24, 12]. Just like standard support vector machines, it allows us
to adopt non-linear kernels, thus enabling price functions that depend in a non-linear way
on the outcome and on the reported types of other agents. 

In illustrating the framework, we focus on two situations where strategyproof payment 
rules are not available: a greedy outcome rule for a multi-minded combinatorial auction in
which each agent is interested in a constant number of bundles, and an assignment problem 
with an egalitarian outcome rule, i.e., an outcome rule that maximizes the minimum value 
of any agent. The experimental results we obtain are encouraging, in that they demonstrate 
low expected ex post regret even when the 0/1 classification accuracy is only moderately 
good, and in particular better regret properties than those obtained through simple VCG- 
based payment rules that we adopt as a baseline. In addition, we give special consideration 
to the failure of ex post individual rationality, and introduce methods to bias the classifier 
to avoid these kinds of errors as well as post hoc adjustments that eliminate them. As 
far as scalability is concerned, we emphasize that the computational cost associated with 
our approach occurs offline during training. The learned payment rules have a succinct 
description and can be evaluated quickly in a deployed mechanism. 

Related Work 

Conitzer and Sandholm [6] introduced the agenda of automated mechanism design (AMD),
which formulates mechanism design as an optimization problem. The output is the de- 
scription of a mechanism, i.e., an explicit mapping from types to outcomes and payments. 

4 A discriminant function can be thought of as a way to distinguish between different outcomes for the 
purpose of making a prediction. 
AMD is intractable in general, as the type space can be exponential in both the number of
agents and the number of items, but progress has recently been made in finding approxi- 
mate solutions for domains with additive value structure and symmetry assumptions, and 
adopting Bayes-Nash incentive compatibility (BIC) as the goal [3]. Another approach is 
to search through a parameterized space of incentive-compatible mechanisms [9]. 

A parallel literature allows outcome rules to be represented by algorithms, like our 
work, and thus extends to richer domains. Lavi and Swamy [15] employ LP relaxation to 
obtain mechanisms satisfying BIC for set-packing problems, achieving worst-case approxi- 
mation guarantees for combinatorial auctions. Hartline and Lucier [10] and Hartline et al. 
[11] propose a general approach, applicable to both single-parameter and multi-parameter
domains, for converting any approximation algorithm into a mechanism satisfying BIC 
that has essentially the same approximation factor with respect to social welfare. This 
approach differs from ours in that it adopts BIC as a target rather than the minimization 
of expected ex post regret. In addition, it evaluates the outcome rule on a number of 
randomly perturbed replicas of the instance that is polynomial in the size of a discrete 
type space, which is infeasible for combinatorial auctions where this size is exponential in 
the number of items. The computational requirements of our trained rule are equivalent 
to that of the original outcome rule. 

Lahaie [13, 14] also adopts a kernel-based approach for combinatorial auctions, but
focuses not on learning a payment rule for a given outcome rule but rather on solving 
the winner determination and pricing problem for a given instance of a combinatorial 
auction. Lahaie introduces the use of kernel methods to compactly represent non-linear 
price functions, which is also present in our work, but obtains incentive properties more 
indirectly through a connection between regularization and price sensitivity. 



2 Preliminaries 

A mechanism design problem is given by a set $N = \{1, 2, \dots, n\}$ of agents that interact to
select an element from a set $\Omega \subseteq \times_{i \in N} \Omega_i$ of outcomes, where $\Omega_i$ denotes the set of possible
outcomes for agent $i \in N$. Agent $i \in N$ is associated with a type $\theta_i$ from a set $\Theta_i$ of
possible types, corresponding to the private information available to this agent. We write
$\theta = (\theta_1, \dots, \theta_n)$ for a profile of types for the different agents, $\Theta = \times_{i \in N} \Theta_i$ for the set of
possible type profiles, and $\theta_{-i} \in \Theta_{-i}$ for a profile of types for all agents but $i$. Each agent
$i \in N$ is further assumed to employ preferences over $\Omega_i$, represented by a valuation function
$v_i : \Theta_i \times \Omega_i \to \mathbb{R}$. We assume that for all $i \in N$ and $\theta_i \in \Theta_i$ there exists an outcome $o \in \Omega$
with $v_i(\theta_i, o_i) = 0$.

A (direct) mechanism is a pair $(g, p)$ of an outcome rule $g : \Theta \to \times_{i \in N} \Omega_i$ and a payment
rule $p : \Theta \to \mathbb{R}^n_{\geq 0}$. The intuition is that the agents reveal to the mechanism a type profile
$\theta \in \Theta$, possibly different from their true types, and the mechanism chooses outcome $g(\theta)$
and charges each agent $i$ a payment of $p_i(\theta) = (p(\theta))_i$. We assume quasi-linear preferences,
so the utility of agent $i$ with type $\theta_i \in \Theta_i$ given a profile $\theta' \in \Theta$ of revealed types is
$u_i(\theta', \theta_i) = v_i(\theta_i, g_i(\theta')) - p_i(\theta')$, where $g_i(\theta) = (g(\theta))_i$ denotes the outcome for agent $i$. A
crucial property of mechanism $(g, p)$ is that its outcome rule is feasible, i.e., that $g(\theta) \in \Omega$
for all $\theta \in \Theta$.

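Outcome rule $g$ satisfies consumer sovereignty if for all $i \in N$, $o_i \in \Omega_i$, and $\theta_{-i}' \in \Theta_{-i}$,
there exists $\theta_i' \in \Theta_i$ such that $g_i(\theta_i', \theta_{-i}') = o_i$, and reachability of the null outcome if for all
$i \in N$, $\theta_i \in \Theta_i$, and $\theta_{-i}' \in \Theta_{-i}$, there exists $\theta_i' \in \Theta_i$ such that $v_i(\theta_i, g_i(\theta_i', \theta_{-i}')) = 0$.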

Mechanism $(g, p)$ is dominant strategy incentive compatible, or strategyproof, if each
agent maximizes its utility by reporting its true type, irrespective of the reports of the other
agents, i.e., if for all $i \in N$, $\theta_i \in \Theta_i$, and $\theta' = (\theta_i', \theta_{-i}') \in \Theta$, $u_i((\theta_i, \theta_{-i}'), \theta_i) \geq u_i((\theta_i', \theta_{-i}'), \theta_i)$;
it satisfies individual rationality (IR) if agents reporting their true types are guaranteed
non-negative utility, i.e., if for all $i \in N$, $\theta_i \in \Theta_i$, and $\theta_{-i}' \in \Theta_{-i}$, $u_i((\theta_i, \theta_{-i}'), \theta_i) \geq 0$.
Observe that given reachability of the null outcome, strategyproofness implies individual 
rationality. 

It is known that a mechanism (g,p) is strategyproof if and only if the payment of an 
agent is independent of its reported type and the chosen outcome simultaneously maximizes 
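the utility of all agents, i.e., if for every $\theta \in \Theta$,

$$p_i(\theta) = t_i(\theta_{-i}, g_i(\theta)) \quad \text{for all } i \in N, \text{ and} \tag{1}$$

$$g_i(\theta) \in \arg\max_{o_i' \in \Omega_i} \bigl( v_i(\theta_i, o_i') - t_i(\theta_{-i}, o_i') \bigr) \quad \text{for all } i \in N, \tag{2}$$

for a price function $t_i : \Theta_{-i} \times \Omega_i \to \mathbb{R}$. This simple characterization is crucial for the main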
results in the present paper, providing the basis with which the discriminant function of a 
classifier can be used to induce a payment rule. 

In addition, a direct characterization of strategyproofness in terms of monotonicity 
properties of outcome rules explains which outcome rules can be associated with a payment 
rule in order to be "implementable" within a strategyproof mechanism [22, 1]. These
monotonicity properties provide a fundamental constraint on when our machine learning 
framework can hope to identify a payment rule that provides full strategyproofness. 

We quantify the degree of strategyproofness of a mechanism in terms of the regret 
experienced by an agent when revealing its true type, i.e., the potential gain in utility by 
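revealing a different type instead. Formally, the ex post regret of agent $i \in N$ in mechanism
$(g, p)$, given true type $\theta_i \in \Theta_i$ and reported types $\theta_{-i}' \in \Theta_{-i}$ of the other agents, is

$$\mathrm{rgt}_i(\theta_i, \theta_{-i}') = \max_{\theta_i' \in \Theta_i} u_i((\theta_i', \theta_{-i}'), \theta_i) - u_i((\theta_i, \theta_{-i}'), \theta_i).$$

Analogously, the ex post violation of individual rationality of agent $i \in N$ in mechanism
$(g, p)$, given true type $\theta_i \in \Theta_i$ and reported types $\theta_{-i}' \in \Theta_{-i}$ of the other agents, is

$$\mathrm{irv}_i(\theta_i, \theta_{-i}') = \bigl| \min\bigl( u_i((\theta_i, \theta_{-i}'), \theta_i), 0 \bigr) \bigr|.$$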
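
To make these two quantities concrete, the following Python sketch evaluates them by brute force over a finite grid of candidate misreports. The two single-item auctions, the function names, and the grid are illustrative assumptions rather than part of the framework; ties are broken in favor of the lowest agent index, and a grid search only gives a lower bound on the true regret when the type space is continuous.

```python
def second_price(bids):
    """Illustrative mechanism (g, p): the highest bid wins and pays the second-highest bid."""
    winner = max(range(len(bids)), key=lambda i: bids[i])
    price = sorted(bids, reverse=True)[1]
    alloc = [1 if i == winner else 0 for i in range(len(bids))]
    pay = [price if i == winner else 0.0 for i in range(len(bids))]
    return alloc, pay


def first_price(bids):
    """Illustrative non-strategyproof mechanism: the highest bid wins and pays its own bid."""
    winner = max(range(len(bids)), key=lambda i: bids[i])
    alloc = [1 if i == winner else 0 for i in range(len(bids))]
    pay = [bids[i] if i == winner else 0.0 for i in range(len(bids))]
    return alloc, pay


def utility(mech, i, true_value, reports):
    """Quasi-linear utility of agent i: value for its outcome minus its payment."""
    alloc, pay = mech(reports)
    return true_value * alloc[i] - pay[i]


def ex_post_regret(mech, i, true_value, others, grid):
    """Best utility over the candidate misreports in `grid`, minus the truthful utility."""
    def u(report):
        return utility(mech, i, true_value, others[:i] + [report] + others[i:])
    return max(u(r) for r in list(grid) + [true_value]) - u(true_value)


def ir_violation(mech, i, true_value, others):
    """Absolute value of the truthful utility when it is negative, zero otherwise."""
    truthful = utility(mech, i, true_value, others[:i] + [true_value] + others[i:])
    return abs(min(truthful, 0.0))


grid = [0.5 * k for k in range(21)]  # candidate misreports 0.0, 0.5, ..., 10.0
regret_second = ex_post_regret(second_price, 0, 5.0, [3.0, 4.0], grid)
regret_first = ex_post_regret(first_price, 0, 5.0, [3.0, 4.0], grid)
```

In this toy instance the second-price rule leaves agent 0 with no profitable misreport on the grid, whereas under the first-price rule the agent gains by shading its bid down to the highest competing report.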

We consider situations where types are drawn from a distribution with probability
density function $D : \Theta \to \mathbb{R}$ such that $D(\theta) \geq 0$ and $\int_{\theta \in \Theta} D(\theta) = 1$. Given such a
distribution, and assuming that all agents report their true types, the expected ex post
regret of agent $i \in N$ in mechanism $(g, p)$ is $\mathbb{E}_{\theta \sim D}[\mathrm{rgt}_i(\theta_i, \theta_{-i})]$.

Outcome rule $g$ is agent symmetric if for every permutation $\pi$ of $N$ and all types
$\theta, \theta' \in \Theta$ such that $\theta_i = \theta'_{\pi(i)}$ for all $i \in N$, $g_i(\theta) = g_{\pi(i)}(\theta')$ for all $i \in N$. Note that this
specifically requires that $\Theta_i = \Theta_j$ and $\Omega_i = \Omega_j$ for all $i, j \in N$. Similarly, type distribution
$D$ is agent symmetric if $D(\theta) = D(\theta')$ for every permutation $\pi$ of $N$ and all types $\theta, \theta' \in \Theta$
such that $\theta_i = \theta'_{\pi(i)}$ for all $i \in N$. Given agent symmetry, a price function $t_1 : \Theta_{-1} \times \Omega_1 \to \mathbb{R}$
for agent 1 can be used to generate the payment rule $p$ for a mechanism $(g, p)$, with

$$p(\theta) = \bigl( t_1(\theta_{-1}, g_1(\theta)), t_1(\theta_{-2}, g_2(\theta)), \dots, t_1(\theta_{-n}, g_n(\theta)) \bigr),$$

so that the expected ex post regret is the same for every agent. 

We assume agent symmetry in the sequel, which precludes outcome rules that break 
ties based on agent identity, but obviates the need to train a separate classifier for each 
agent while also providing some benefits in terms of presentation. Because ties occur only 
with negligible probability in our experimental framework, the experimental results are not 
affected by this assumption. 

3 Payment Rules from Multi-Class Classifiers

A multi-class classifier is a function $h : X \to Y$, where $X$ is an input domain and $Y$ is a
discrete output domain. One could imagine, for example, a multi-class classifier that labels 
a given image as that of a dog, a cat, or some other animal. In the context of mechanism 
design, we will be interested in classifiers that take as input a type profile and output an 
outcome. What distinguishes this from an outcome rule is that we will impose restrictions 
on the form the classifier can take. 

Classification typically assumes an underlying target function $h^* : X \to Y$, and the
goal is to learn a classifier $h$ that minimizes disagreements with $h^*$ on a given input distribution
$D$ on $X$, based only on a finite set of training data $\{(x^1, y^1), \dots, (x^\ell, y^\ell)\} =
\{(x^1, h^*(x^1)), \dots, (x^\ell, h^*(x^\ell))\}$ with $x^1, \dots, x^\ell$ drawn from $D$. This may be challenging because
the amount of training data is limited, or because $h$ is restricted to some hypothesis
class $\mathcal{H}$ with a certain simple structure, e.g., linear threshold functions. If $h(x) = h^*(x)$
for all $x \in X$, we say that $h$ is a perfect classifier for $h^*$.

We consider classifiers that are defined in terms of a discriminant function $f : X \times Y \to \mathbb{R}$,
such that

$$h(x) \in \arg\max_{y \in Y} f(x, y)$$

for all $x \in X$. More specifically, we will be concerned with linear discriminant functions of
the form

$$f_w(x, y) = w^T \psi(x, y)$$

for a weight vector $w \in \mathbb{R}^m$ and a feature map $\psi : X \times Y \to \mathbb{R}^m$, where $m \in \mathbb{N} \cup \{\infty\}$.⁵ The
function $\psi$ maps input and output into an $m$-dimensional space, which generally allows
non-linear features to be expressed.
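
As a small illustration of a classifier defined through a linear discriminant, the Python sketch below predicts the label with the largest score. The block-structured feature map, which copies the input into the block belonging to the candidate label, is a standard multi-class construction chosen here only as an assumption for the example; the weights are hand-picked, not learned.

```python
def dot(u, v):
    """Standard inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def block_feature_map(num_labels):
    """Hypothetical feature map psi(x, y): x is copied into the block of label y."""
    def psi(x, y):
        features = [0.0] * (num_labels * len(x))
        features[y * len(x):(y + 1) * len(x)] = x
        return features
    return psi


def make_classifier(w, psi, labels):
    """Return the discriminant f_w(x, y) = w . psi(x, y) and the classifier h(x)."""
    def f(x, y):
        return dot(w, psi(x, y))

    def h(x):
        # Predict a label that maximizes the discriminant.
        return max(labels, key=lambda y: f(x, y))

    return f, h


labels = [0, 1, 2]
psi = block_feature_map(len(labels))
w = [1.0, 0.0, 0.0, 1.0, -1.0, -1.0]  # hand-picked weights, one block per label
f, h = make_classifier(w, psi, labels)
predictions = [h([3.0, 1.0]), h([1.0, 3.0]), h([-2.0, -1.0])]
```

Although the discriminant is linear in the weights, the choice of feature map determines how expressive the resulting decision rule is.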

3.1 Mechanism Design as Classification 

Assume that we are given an outcome rule g and access to a distribution D over type 
profiles, and want to design a corresponding payment rule p that gives the mechanism 
(g,p) the best possible incentive properties. Assuming agent symmetry, we focus on a 
partial outcome rule $g_1 : \Theta \to \Omega_1$ and train a classifier to predict the outcome to agent 1.
To train a classifier, we generate examples by drawing a type profile $\theta \in \Theta$ from distribution
$D$ and applying outcome rule $g$ to obtain the target class $g_1(\theta) \in \Omega_1$.

We impose a special structure on the hypothesis class. A classifier $h_w : \Theta \to \Omega_1$ is
admissible if it is defined in terms of a discriminant function $f_w$ of the form

$$f_w(\theta, o_1) = w_1 v_1(\theta_1, o_1) + w_{-1}^T \psi(\theta_{-1}, o_1)$$

for weights $w$ such that $w_1 \in \mathbb{R}_{>0}$ and $w_{-1} \in \mathbb{R}^m$, and a feature map $\psi : \Theta_{-1} \times \Omega_1 \to \mathbb{R}^m$
for $m \in \mathbb{N} \cup \{\infty\}$.

The first term of $f_w(\theta, o_1)$ only depends on the type of agent 1 and increases in its
valuation for outcome $o_1$, while the remaining terms ignore $\theta_1$ entirely. This restriction
allows us to directly infer agent-independent prices from a trained classifier. For this, define
the associated price function of an admissible classifier $h_w$ as

$$t_w(\theta_{-1}, o_1) = -\frac{1}{w_1} w_{-1}^T \psi(\theta_{-1}, o_1),$$

where we again focus on agent 1 for concreteness. By agent symmetry, we obtain the
mechanism $(g, p_w)$ corresponding to classifier $h_w$ by letting

$$p_w(\theta) = \bigl( t_w(\theta_{-1}, g_1(\theta)), t_w(\theta_{-2}, g_2(\theta)), \dots, t_w(\theta_{-n}, g_n(\theta)) \bigr).$$

Even with admissibility, appropriate choices for the feature map $\psi$ will produce rich
families of classifiers, and thus ultimately useful payment rules. Moreover, this form is
compatible with structural support vector machines, discussed in Section 4.1.
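
The construction of this section can be sketched in Python as follows: an admissible discriminant, the price function it induces, and the payment rule obtained by applying that price function to every agent. The valuation function, the two-dimensional feature map, the weights, and the highest-value outcome rule are hypothetical choices made only so that the sketch runs; they are not the features used in the experiments.

```python
def dot(u, v):
    """Standard inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def admissible_discriminant(w1, w_rest, psi, v1):
    """f_w(theta, o1) = w1 * v1(theta_1, o1) + w_rest . psi(theta_{-1}, o1), with w1 > 0."""
    if w1 <= 0:
        raise ValueError("admissibility requires w1 > 0")

    def f(theta, o1):
        return w1 * v1(theta[0], o1) + dot(w_rest, psi(theta[1:], o1))

    return f


def price_function(w1, w_rest, psi):
    """Associated price t_w(theta_{-1}, o1) = -(1 / w1) * w_rest . psi(theta_{-1}, o1)."""
    def t(theta_others, o1):
        return -dot(w_rest, psi(theta_others, o1)) / w1

    return t


def payment_rule(t, g):
    """Agent-symmetric payment rule: agent i pays t(theta_{-i}, g_i(theta))."""
    def p(theta):
        outcome = g(theta)
        return [t(theta[:i] + theta[i + 1:], outcome[i]) for i in range(len(theta))]

    return p


# Hypothetical single-item instantiation, for illustration only.
def v1(theta1, o1):
    return theta1 * o1


def psi(theta_others, o1):
    return [o1 * sum(theta_others), float(o1)]


def g(theta):
    winner = max(range(len(theta)), key=lambda i: theta[i])
    return tuple(1 if i == winner else 0 for i in range(len(theta)))


w1, w_rest = 2.0, [-0.5, -1.0]
f = admissible_discriminant(w1, w_rest, psi, v1)
t = price_function(w1, w_rest, psi)
p = payment_rule(t, g)
payments = p((4.0, 2.0, 6.0))
```

Dividing the discriminant by the first weight gives value minus price, which is why the price of an outcome never depends on the agent's own report.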



3.2 Example: Single-Item Auction 

Before proceeding further, we illustrate the ideas developed so far in the context of a
single-item auction. In a single-item auction, the type of each agent is a single number,
corresponding to its value for the item being auctioned, and there are two possible allocations
from the point of view of agent 1: one where it receives the item, and one where it
does not. Formally, $\Theta = \mathbb{R}^n$ and $\Omega_1 = \{0, 1\}$.

5 We allow $w$ to have infinite dimension, but require the inner product between $w$ and $\psi(x, y)$ to be
defined in any case. Computationally the infinite-dimensional case is handled through the kernel trick,
which is described in Section 4.1.1.


Consider a setting with three agents and a training set

$$(\theta^1, o_1^1) = ((1, 3, 5), 0), \quad (\theta^2, o_1^2) = ((5, 4, 3), 1), \quad (\theta^3, o_1^3) = ((2, 3, 4), 0),$$

and note that this training set is consistent with an optimal outcome rule, i.e., one that
assigns the item to an agent with maximum value. Our goal is to learn an admissible
classifier

$$h_w(\theta) = \arg\max_{o_1 \in \{0, 1\}} f_w(\theta, o_1) = \arg\max_{o_1 \in \{0, 1\}} \bigl( w_1 v_1(\theta_1, o_1) + w_{-1}^T \psi(\theta_{-1}, o_1) \bigr)$$

that performs well on the training set. Since there are only two possible outcomes, the
outcome chosen by $h_w$ is simply the one with the larger discriminant. A classifier that is
perfect on the training data must therefore satisfy the following constraints:

$$\begin{aligned}
w_1 \cdot 0 + w_{-1}^T \psi((3, 5), 0) &> w_1 \cdot 1 + w_{-1}^T \psi((3, 5), 1), \\
w_1 \cdot 5 + w_{-1}^T \psi((4, 3), 1) &> w_1 \cdot 0 + w_{-1}^T \psi((4, 3), 0), \\
w_1 \cdot 0 + w_{-1}^T \psi((3, 4), 0) &> w_1 \cdot 2 + w_{-1}^T \psi((3, 4), 1).
\end{aligned}$$

This can for example be achieved by setting $w_1 = 1$ and

$$w_{-1}^T \psi((\theta_2, \theta_3), o_1) = \begin{cases} -\max(\theta_2, \theta_3) & \text{if } o_1 = 1 \text{ and} \\ 0 & \text{if } o_1 = 0. \end{cases} \tag{3}$$

Recalling our definition of the price function as $t_w(\theta_{-1}, o_1) = -(1/w_1) w_{-1}^T \psi(\theta_{-1}, o_1)$,
we see that this choice of $w$ and $\psi$ corresponds to the second-price payment rule. We will
see in the next section that this relationship is not a coincidence.⁶
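
The example can be checked numerically. The Python sketch below hard-codes the three training examples and the choice of weights and features described above, then evaluates the classifier and the induced price; the function names and the linear valuation (value times allocation) are assumptions made for the sketch.

```python
# Training set: (type profile, outcome for agent 1).
TRAINING_SET = [((1, 3, 5), 0), ((5, 4, 3), 1), ((2, 3, 4), 0)]
W1 = 1.0


def v1(theta1, o1):
    """Agent 1's value: its type if it receives the item, zero otherwise."""
    return theta1 * o1


def w_psi(theta_others, o1):
    """The term w_{-1} . psi: minus the highest competing value if agent 1 wins, else zero."""
    return -max(theta_others) if o1 == 1 else 0.0


def f_w(theta, o1):
    """Admissible discriminant for agent 1."""
    return W1 * v1(theta[0], o1) + w_psi(theta[1:], o1)


def h_w(theta):
    """Classifier: the outcome with the larger discriminant."""
    return max((0, 1), key=lambda o1: f_w(theta, o1))


def t_w(theta_others, o1):
    """Induced price function."""
    return -w_psi(theta_others, o1) / W1


predictions = [h_w(theta) for theta, _ in TRAINING_SET]
targets = [o1 for _, o1 in TRAINING_SET]
price_when_winning = t_w((4, 3), 1)
```

The predictions agree with the targets on all three examples, and the price charged to a winning agent equals the highest competing value, as in a second-price auction.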

3.3 Perfect Classifiers and Implementable Outcome Rules 

We now formally establish a connection between implementable outcome rules and perfect 
classifiers. 

Theorem 1. Let $(g, p)$ be a strategyproof mechanism with an agent symmetric outcome
rule $g$, and let $t_1$ be the corresponding price function. Then, a perfect admissible classifier
$h_w$ for partial outcome rule $g_1$ exists if $\arg\max_{o_1 \in \Omega_1} (v_1(\theta_1, o_1) - t_1(\theta_{-1}, o_1))$ is unique.

6 In practice, we are limited in the machine learning framework to hypotheses that are linear in
$\psi((\theta_2, \theta_3), o_1)$, and will not be able to guarantee that (3) holds exactly. In Section 4.1.1 we will see,
however, that certain choices of $\psi$ allow for very complex hypotheses that can closely approximate arbitrary
functions.

Proof. By the first characterization of strategyproof mechanisms, $g$ must select an outcome
that maximizes the utility of agent 1 at the current prices, i.e.,

$$g_1(\theta) \in \arg\max_{o_1 \in \Omega_1} \bigl( v_1(\theta_1, o_1) - t_1(\theta_{-1}, o_1) \bigr).$$

Consider the admissible discriminant $f_{(1,1)}(\theta, o_1) = v_1(\theta_1, o_1) - t_1(\theta_{-1}, o_1)$, which uses the
price function $t_1$ as its feature map. Clearly, the corresponding classifier $h_{(1,1)}$ maximizes
the same quantity as $g_1$, and the two must agree if there is a unique maximizer. □

The relationship also works in the opposite direction: a perfect, admissible classifier $h_w$
for outcome rule $g$ can be used to construct a payment rule that turns $g$ into a strategyproof
mechanism.

Theorem 2. Let $g$ be an agent symmetric outcome rule, $h_w : \Theta \to \Omega_1$ an admissible
classifier, and $p_w$ the payment rule corresponding to $h_w$. If $h_w$ is a perfect classifier for the
partial outcome rule $g_1$, then the mechanism $(g, p_w)$ is strategyproof.

We prove this result by expressing the regret of an agent in mechanism $(g, p_w)$ in terms
of the discriminant function $f_w$. Let $\Omega_i(\theta_{-i}) \subseteq \Omega_i$ denote the set of partial outcomes for
agent $i$ that can be obtained under $g$ given reported types $\theta_{-i}$ from all agents but $i$, keeping
the dependence on $g$ silent for notational simplicity.

Lemma 1. Suppose that agent 1 has type $\theta_1$ and that the other agents report types $\theta_{-1}$.
Then the regret of agent 1 for bidding truthfully in mechanism $(g, p_w)$ is

$$\frac{1}{w_1} \Bigl( \max_{o_1 \in \Omega_1(\theta_{-1})} f_w(\theta, o_1) - f_w(\theta, g_1(\theta)) \Bigr).$$

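Proof. We have

$$\begin{aligned}
\mathrm{rgt}_1(\theta) &= \max_{\theta_1' \in \Theta_1} \bigl( v_1(\theta_1, g_1(\theta_1', \theta_{-1})) - p_{w,1}(\theta_1', \theta_{-1}) \bigr) - \bigl( v_1(\theta_1, g_1(\theta)) - p_{w,1}(\theta) \bigr) \\
&= \max_{o_1 \in \Omega_1(\theta_{-1})} \bigl( v_1(\theta_1, o_1) - t_w(\theta_{-1}, o_1) \bigr) - \bigl( v_1(\theta_1, g_1(\theta)) - t_w(\theta_{-1}, g_1(\theta)) \bigr) \\
&= \max_{o_1 \in \Omega_1(\theta_{-1})} \Bigl( v_1(\theta_1, o_1) + \frac{1}{w_1} w_{-1}^T \psi(\theta_{-1}, o_1) \Bigr) - \Bigl( v_1(\theta_1, g_1(\theta)) + \frac{1}{w_1} w_{-1}^T \psi(\theta_{-1}, g_1(\theta)) \Bigr) \\
&= \frac{1}{w_1} \Bigl( \max_{o_1 \in \Omega_1(\theta_{-1})} f_w(\theta, o_1) - f_w(\theta, g_1(\theta)) \Bigr). \qquad \square
\end{aligned}$$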

Proof of Theorem 2. If $h_w$ is a perfect classifier, then the discriminant function $f_w$ satisfies
$\arg\max_{o_1 \in \Omega_1} f_w(\theta, o_1) = g_1(\theta)$ for every $\theta \in \Theta$. Since $g_1(\theta) \in \Omega_1(\theta_{-1})$, we thus have
that $\max_{o_1 \in \Omega_1(\theta_{-1})} f_w(\theta, o_1) = f_w(\theta, g_1(\theta))$. By Lemma 1, the regret of agent 1 for bidding
truthfully in mechanism $(g, p_w)$ is always zero, which means that the mechanism is
strategyproof. □

It bears emphasis that classifier $h_w$ is only used to derive the payment rule $p_w$, while
the outcome is still selected according to $g$. In principle, classifier $h_w$ could be used to
obtain an agent symmetric outcome rule $g_w$ and, since $h_w$ is a perfect classifier for itself,
a strategyproof mechanism $(g_w, p_w)$. Unfortunately, outcome rule $g_w$ is not in general
feasible. Mechanism $(g, p_w)$, on the other hand, is not strategyproof when $h_w$ fails to be
a perfect classifier for $g$. While payment rule $p_w$ always satisfies the agent-independence
property (1) required for strategyproofness, the "optimization" property (2) might be violated
when $h_w(\theta) \neq g_1(\theta)$.

3.4 Approximate Classification and Approximate Strategyproofness 

A perfect admissible classifier for outcome rule g leads to a payment rule that turns g into 
a strategyproof mechanism. We now show that this result extends gracefully to situations 
where no such payment rule is available, by relating the expected ex post regret of a 
mechanism (g,p) to a measure of the generalization error of a classifier for g. 

Fix a feature map $\psi$, and denote by $\mathcal{H}_\psi$ the space of all admissible classifiers with this
feature map. The discriminant loss of a classifier $h_w \in \mathcal{H}_\psi$ with respect to a type profile
$\theta$ and an outcome $o_1 \in \Omega_1$ is given by

$$\Delta_w(o_1, \theta) = \frac{1}{w_1} \bigl( f_w(\theta, h_w(\theta)) - f_w(\theta, o_1) \bigr).$$

Intuitively the discriminant loss measures how far, in terms of the normalized discriminant,
$h_w$ is from predicting the correct outcome for type profile $\theta$, assuming the correct outcome is
$o_1$. Note that $\Delta_w(o_1, \theta) \geq 0$ for all $o_1 \in \Omega_1$ and $\theta \in \Theta$, and $\Delta_w(o_1, \theta) = 0$ if $o_1 = h_w(\theta)$. Note
further that $h_w(\theta) = h_{w'}(\theta)$ does not imply that $\Delta_w(o_1, \theta) = \Delta_{w'}(o_1, \theta)$ for all $o_1 \in \Omega_1$:
even if two classifiers predict the same outcome, one of them may still be closer to predicting
the correct outcome $o_1$.
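
A minimal Python sketch of the discriminant loss follows. It assumes a single-item setting with a deliberately mis-specified price (half of the highest competing value), so that the classifier disagrees with the efficient outcome rule on the chosen profile; all names and numbers are illustrative.

```python
def discriminant_loss(f, w1, theta, o1, outcomes):
    """Normalized gap between the best discriminant value and the value at outcome o1."""
    best = max(f(theta, o) for o in outcomes)
    return (best - f(theta, o1)) / w1


W1 = 1.0


def f_w(theta, o1):
    """Hypothetical admissible discriminant: value minus half the highest competing value."""
    price = 0.5 * max(theta[1:]) if o1 == 1 else 0.0
    return W1 * (theta[0] * o1 - price)


theta = (3.0, 4.0, 1.0)  # agent 1 does not have the highest value
correct_outcome = 0      # so an efficient outcome rule gives agent 1 nothing
loss_at_correct = discriminant_loss(f_w, W1, theta, correct_outcome, (0, 1))
loss_at_predicted = discriminant_loss(f_w, W1, theta, 1, (0, 1))
```

Here the classifier predicts that agent 1 wins, because at the learned price of 2 winning looks attractive. The loss at the correct outcome is 1, which is exactly the utility agent 1 would gain if it could obtain the item at that price.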

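The generalization error of classifier $h_w \in \mathcal{H}_\psi$ with respect to a type distribution $D$
and a partial outcome rule $g_1 : \Theta \to \Omega_1$ is then given by

$$\int_{\theta \in \Theta} \Delta_w(g_1(\theta), \theta) \, D(\theta) \, d\theta.$$
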

The following result establishes a connection between the generalization error and the 
expected ex post regret of the corresponding mechanism. 

Theorem 3. Consider an outcome rule $g$, a space $\mathcal{H}_\psi$ of admissible classifiers, and a
type distribution $D$. Let $h_{w^*} \in \mathcal{H}_\psi$ be a classifier that minimizes generalization error with
respect to $D$ and $g$ among all classifiers in $\mathcal{H}_\psi$. Then the following holds:

1. If $g$ satisfies consumer sovereignty, then $(g, p_{w^*})$ minimizes expected ex post regret
with respect to $D$ among all mechanisms $(g, p_w)$ corresponding to classifiers $h_w \in \mathcal{H}_\psi$.

2. Otherwise, $(g, p_{w^*})$ minimizes an upper bound on expected ex post regret with respect
to $D$ amongst all mechanisms $(g, p_w)$ corresponding to classifiers $h_w \in \mathcal{H}_\psi$.

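Proof. For the second property, observe that

$$\begin{aligned}
\Delta_w(g_1(\theta), \theta) &= \frac{1}{w_1} \bigl( f_w(\theta, h_w(\theta)) - f_w(\theta, g_1(\theta)) \bigr) \\
&= \frac{1}{w_1} \Bigl( \max_{o_1 \in \Omega_1} f_w(\theta, o_1) - f_w(\theta, g_1(\theta)) \Bigr) \\
&\geq \frac{1}{w_1} \Bigl( \max_{o_1 \in \Omega_1(\theta_{-1})} f_w(\theta, o_1) - f_w(\theta, g_1(\theta)) \Bigr) = \mathrm{rgt}_1(\theta),
\end{aligned}$$

where the last equality holds by Lemma 1. If $g$ satisfies consumer sovereignty, then the
inequality holds with equality, and the first property follows as well. □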

Minimization of expected regret itself, rather than an upper bound, can also be achieved
if the learner has access to the set $\Omega_1(\theta_{-1})$ for every $\theta_{-1} \in \Theta_{-1}$.



4 A Solution using Structural Support Vector Machines 

In this section we discuss the method of structural support vector machines (structural
SVMs) [24, 12], and show how it can be adapted for the purpose of learning classifiers with
admissible discriminant functions.



4.1 Structural SVMs 

Given an input space $X$, a discrete output space $Y$, a target function $h^* : X \to Y$, and a
set of training examples $\{(x^1, h^*(x^1)), \dots, (x^\ell, h^*(x^\ell))\} = \{(x^1, y^1), \dots, (x^\ell, y^\ell)\}$, structural
SVMs learn a multi-class classifier $h$ that on input $x \in X$ selects an output $y \in Y$ that
maximizes $f_w(x, y) = w^T \psi(x, y)$. For a given feature map $\psi$, the training problem is to
find a vector $w$ for which $h_w$ has low generalization error.

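Given examples $\{(x^1, y^1), \dots, (x^\ell, y^\ell)\}$, training is achieved by solving the following
convex optimization problem:

$$\begin{aligned}
\min_{w, \xi} \quad & \frac{1}{2} w^T w + \frac{C}{\ell} \sum_{k=1}^{\ell} \xi^k && \text{(Training Problem 1)} \\
\text{s.t.} \quad & w^T \bigl( \psi(x^k, y^k) - \psi(x^k, y) \bigr) \geq \mathcal{L}(y^k, y) - \xi^k && \text{for all } k = 1, \dots, \ell, \; y \in Y \\
& \xi^k \geq 0 && \text{for all } k = 1, \dots, \ell.
\end{aligned}$$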

The goal is to find a weight vector $w$ and slack variables $\xi^k$ such that the objective function
is minimized while satisfying the constraints. The learned weight vector $w$ parameterizes
the discriminant function $f_w$, which in turn defines the classifier $h_w$. The $k$th constraint
states that the value of the discriminant function on $(x^k, y^k)$ should exceed the value of
the discriminant function on $(x^k, y)$ by at least $\mathcal{L}(y^k, y)$, where $\mathcal{L}$ is a loss function that
penalizes misclassification, with $\mathcal{L}(y, y) = 0$ and $\mathcal{L}(y, y') \geq 0$ for all $y, y' \in Y$. We generally
use a 0/1 loss function, but consider an alternative in Section 4.2.2 to improve ex post IR
properties. Positive values for the slack variables $\xi^k$ allow the weight vector to violate some
of the constraints.

The other term in the objective, the squared norm of $w$, penalizes scaling of $w$. This is
necessary because scaling of $w$ can arbitrarily increase the margin between $f_w(x^k, y^k)$ and
$f_w(x^k, y)$ and make the constraints easier to satisfy. Smaller values of $w$, on the other hand,
increase the ability of the learned classifier to generalize by decreasing the propensity to
over-fit to the training data. Parameter $C$ is therefore a regularization parameter: larger
values of $C$ encourage small $\xi^k$ and larger $w$, such that more points are classified correctly,
but with a smaller margin.
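
For a fixed weight vector, the smallest feasible slack for each example and the resulting objective value of Training Problem 1 can be computed directly, as in the Python sketch below. It enumerates all labels, which is only practical for a small output space, and it evaluates the objective rather than minimizing it; the toy feature map, examples, and weights are assumptions for illustration.

```python
def dot(u, v):
    """Standard inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def zero_one_loss(y_true, y):
    """0/1 loss: zero for the correct label, one otherwise."""
    return 0.0 if y == y_true else 1.0


def slack(w, psi, loss, x, y_true, labels):
    """Smallest slack satisfying every margin constraint of one example.

    The term for y == y_true is zero, so the result is never negative.
    """
    f_true = dot(w, psi(x, y_true))
    return max(loss(y_true, y) - (f_true - dot(w, psi(x, y))) for y in labels)


def objective(w, psi, loss, examples, labels, C):
    """Regularizer plus C times the average slack over the training examples."""
    total_slack = sum(slack(w, psi, loss, x, y, labels) for x, y in examples)
    return 0.5 * dot(w, w) + (C / len(examples)) * total_slack


# Hypothetical two-label problem with a scalar input.
def psi(x, y):
    return [x, 0.0] if y == 0 else [0.0, x]


examples = [(2.0, 0), (-0.25, 1)]
w = [1.0, -1.0]
value = objective(w, psi, zero_one_loss, examples, [0, 1], C=2.0)
```

The first example is classified with a margin larger than the loss and contributes no slack, while the second is classified correctly but with a margin of only 0.5, so it contributes slack 0.5.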



4.1.1 The Feature Map and the Kernel Trick 

Given a feature map $\psi$, the feature vector $\psi(x, y)$ for $x \in X$ and $y \in Y$ provides an alternate
representation of the input-output pair $(x, y)$. It is useful to consider feature maps $\psi$ for
which $\psi(x, y) = \phi(\chi(x, y))$, where $\chi : X \times Y \to \mathbb{R}^s$ for some $s \in \mathbb{N}$ is an attribute map that
combines $x$ and $y$ into a single attribute vector $\chi(x, y)$ compactly representing the pair,
and $\phi : \mathbb{R}^s \to \mathbb{R}^m$ for $m > s$ maps the attribute vector to a higher-dimensional space in
a non-linear way. In this way, SVMs can achieve non-linear classification in the original
space.

While we work hard to keep $s$ small, the so-called kernel trick means that we do not have
the same problem with $m$: it turns out that in the dual of Training Problem 1, $\psi(x, y)$ only
appears in an inner product of the form $\langle \psi(x, y), \psi(x', y') \rangle$, or, for a decomposable feature
map, $\langle \phi(z), \phi(z') \rangle$ where $z = \chi(x, y)$ and $z' = \chi(x', y')$. For computational tractability it
therefore suffices that this inner product can be computed efficiently, and the "trick" is to
choose $\phi$ such that $\langle \phi(z), \phi(z') \rangle = K(z, z')$ for a simple closed-form function $K$, known as
the kernel.

In this paper, we consider polynomial kernels $K_{\mathrm{poly}d}$, parameterized by $d \in \mathbb{N}^+$, and
radial basis function (RBF) kernels $K_{\mathrm{RBF}}$, parameterized by $\gamma = 1/(2\sigma^2)$ for $\sigma \in \mathbb{R}^+$:

$$K_{\mathrm{poly}d}(z, z') = (z \cdot z')^d,$$

$$K_{\mathrm{RBF}}(z, z') = \exp\left(-\gamma \left(\|z\|^2 + \|z'\|^2 - 2 z \cdot z'\right)\right).$$

Both polynomial and RBF kernels use the standard inner product of their arguments, so
their efficient computation requires that $\chi(x, y) \cdot \chi(x', y')$ can be computed efficiently.
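
Both kernels can be written purely in terms of inner products of attribute vectors, as the following minimal Python sketch shows. The function names are hypothetical, and attribute vectors are assumed to be plain sequences of numbers.

```python
import math

def dot(z, zp):
    """Standard inner product of two attribute vectors."""
    return sum(a * b for a, b in zip(z, zp))

def k_poly(z, zp, d):
    """Polynomial kernel (z . z')^d."""
    return dot(z, zp) ** d

def k_rbf(z, zp, gamma):
    """RBF kernel exp(-gamma * ||z - z'||^2), expanded into inner products."""
    sq_dist = dot(z, z) + dot(zp, zp) - 2.0 * dot(z, zp)
    return math.exp(-gamma * sq_dist)
```

Any attribute map for which `dot` can be evaluated quickly therefore supports both kernels.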

4.1.2 Dealing with an Exponentially Large Output Space



Training Problem 1 has $\Omega(|Y| \ell)$ constraints, where $Y$ is the output space and $\ell$ the number
of training instances, and enumerating all of them is computationally prohibitive when $Y$
is large. Joachims et al. [12] address this issue for structural SVMs through constraint
generation: starting from an empty set of constraints, this technique iteratively adds a
constraint that is maximally violated by the current solution, until that violation is below
a desired threshold $\epsilon$. Joachims et al. show that this will happen after no more than $O(C/\epsilon)$
iterations, each of which requires $O(\ell)$ time and memory. However, this approach assumes
the existence of an efficient separation oracle, which given a weight vector $w$ and an input
$x$ finds an output $y \in \arg\max_{y' \in Y} f_w(x, y')$. The existence of such an oracle remains an
open question in application to combinatorial auctions; see Section 5.1.3 for additional
discussion.
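
When $Y$ is small enough to enumerate, the separation step can be sketched as follows. This is an illustrative brute-force version, not the oracle interface of any particular package: the function names are hypothetical, and the search uses the loss-augmented score $\mathcal{L}(y^k, y) + f_w(x^k, y)$, which is an assumption about how the most violated constraint is located.

```python
def brute_force_oracle(score, outputs):
    """Return an output maximizing score(y) by enumerating all outputs."""
    return max(outputs, key=score)

def most_violated(score, loss, outputs, y_true, slack):
    """Find the most violated margin constraint for one training example.

    score(y) plays the role of f_w(x^k, y) and slack is the current xi^k.
    Returns (y, violation), or None if no constraint is violated.
    """
    y_hat = brute_force_oracle(lambda y: loss(y_true, y) + score(y), outputs)
    violation = loss(y_true, y_hat) + score(y_hat) - score(y_true) - slack
    return (y_hat, violation) if violation > 0 else None
```

Constraint generation would call `most_violated` for each example, add the returned constraints, re-solve, and stop once every violation is below $\epsilon$.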



4.1.3 Required Information 

In summary, the use of structural SVMs requires specification of the following: 

1. The input space $X$, the discrete output space $Y$, and examples of input-output pairs.

2. An attribute map $\chi : X \times Y \to \mathbb{R}^s$. This function generates an attribute vector that
combines the input and output data into a single object.

3. A kernel function $K(z, z')$, typically chosen from a well-known set of candidates, e.g.,
polynomial or RBF. The kernel implicitly calculates the inner product $\langle \phi(z), \phi(z') \rangle$,
i.e., the inner product between mappings of the attribute vectors into a high-dimensional space.

4. If the space $Y$ is prohibitively large, a routine that allows for efficient separation, i.e.,
a function that computes $\arg\max_{y \in Y} f_w(x, y)$ for a given $w, x$.

In addition, the user needs to stipulate particular training parameters, such as the
regularization parameter $C$, and the kernel parameter $\gamma$ if the RBF kernel is being used.



4.2 Structural SVMs for Mechanism Design 

We now specialize structural SVMs such that their learned discriminant function will
manifest as a payment rule for a given symmetric outcome function $g$ and distribution $D$. In this
application, the input domain $X$ is the space of type profiles $\Theta$, and the output domain $Y$ is
the space $\Omega_1$ of outcomes for agent 1. Thus we construct training data by sampling $\theta \sim D$
and applying $g$ to these inputs: $\{(\theta^1, g_1(\theta^1)), \ldots, (\theta^\ell, g_1(\theta^\ell))\} = \{(\theta^1, o_1^1), \ldots, (\theta^\ell, o_1^\ell)\}$. For
admissibility of the learned hypothesis $h_w(\theta) = \arg\max_{o_1 \in \Omega_1} w^T \psi(\theta, o_1)$, we require that

$$\psi(\theta, o_1) = (v_1(\theta_1, o_1), \psi'(\theta_{-1}, o_1)).$$

When learning payment rules, we therefore use an attribute map $\chi' : \Theta_{-1} \times \Omega_1 \to \mathbb{R}^s$ rather
than $\chi : \Theta \times \Omega_1 \to \mathbb{R}^s$, and the kernel $\phi'$ we specify will only be applied to the output of
$\chi'$. This results in the following more specialized training problem:

$$\min_{w, \xi} \; \frac{1}{2} w^T w + \frac{C}{\ell} \sum_{k=1}^{\ell} \xi^k \qquad \text{(Training Problem 2)}$$

$$\text{s.t.} \quad \left(w_1 v_1(\theta_1^k, o_1^k) + w_{-1}^T \psi'(\theta_{-1}^k, o_1^k)\right) - \left(w_1 v_1(\theta_1^k, o_1) + w_{-1}^T \psi'(\theta_{-1}^k, o_1)\right) \ge \mathcal{L}(o_1^k, o_1) - \xi^k$$

$$\text{for all } k = 1, \ldots, \ell, \; o_1 \in \Omega_1,$$

$$\xi^k \ge 0 \quad \text{for all } k = 1, \ldots, \ell.$$

If $w_1 > 0$ then the weights $w$ together with the feature map $\psi'$ define a price function
$t_w(\theta_{-1}, o_1) = -(1/w_1) w_{-1}^T \psi'(\theta_{-1}, o_1)$ that can be used to define payments $p_w(\theta)$, as
described in Section 3.1. In this case, we can also relate the regret in the induced mechanism
$(g, p_w)$ to the classification error as described in Section 3.
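
For intuition, the following Python sketch reads a price off a learned weight vector in the special case of an explicit (linear) feature vector $\psi'$; with a non-linear kernel the inner product $w_{-1}^T \psi'$ would instead be evaluated through the dual. The function names are hypothetical.

```python
def price(w, psi_prime):
    """t_w = -(1/w_1) * w_{-1}^T psi'(theta_{-1}, o_1); requires w_1 > 0."""
    w1, w_rest = w[0], w[1:]
    if w1 <= 0:
        raise ValueError("the price function is only defined for w_1 > 0")
    return -sum(a * b for a, b in zip(w_rest, psi_prime)) / w1

def discriminant(w, value, psi_prime):
    """f_w = w_1 * v_1 + w_{-1}^T psi', which equals w_1 * (value - price)."""
    return w[0] * value + sum(a * b for a, b in zip(w[1:], psi_prime))
```

The identity $f_w = w_1 (v_1 - t_w)$ is what lets the discriminant function be read as a scaled utility.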



Theorem 4. Consider training data $\{(\theta^1, o_1^1), \ldots, (\theta^\ell, o_1^\ell)\}$. Let $g$ be an outcome function
such that $g_1(\theta^k) = o_1^k$ for all $k$. Let $w, \xi^k$ be the weight vector and slack variables output
by Training Problem 2, with $w_1 > 0$. Consider the corresponding mechanism $(g, p_w)$. For each
type profile $\theta^k$,

$$\mathrm{rgt}_1(\theta^k) \le \frac{1}{w_1} \xi^k.$$

Proof. Consider input $\theta^k$. The constraints in the training problem impose that for every
outcome $o_1 \in \Omega_1$,

$$w_1 v_1(\theta_1^k, o_1^k) + w_{-1}^T \psi'(\theta_{-1}^k, o_1^k) - \left(w_1 v_1(\theta_1^k, o_1) + w_{-1}^T \psi'(\theta_{-1}^k, o_1)\right) \ge \mathcal{L}(o_1^k, o_1) - \xi^k.$$

Rearranging,

$$\xi^k \ge \mathcal{L}(o_1^k, o_1) + \left(w_1 v_1(\theta_1^k, o_1) + w_{-1}^T \psi'(\theta_{-1}^k, o_1)\right) - \left(w_1 v_1(\theta_1^k, o_1^k) + w_{-1}^T \psi'(\theta_{-1}^k, o_1^k)\right)$$

$$= \mathcal{L}(o_1^k, o_1) + f_w(\theta^k, o_1) - f_w(\theta^k, o_1^k).$$

This inequality holds for every $o_1 \in \Omega_1$, so

$$\xi^k \ge \max_{o_1 \in \Omega_1} \left(\mathcal{L}(o_1^k, o_1) + f_w(\theta^k, o_1) - f_w(\theta^k, o_1^k)\right)$$

$$\ge \max_{o_1 \in \Omega_1} \left(f_w(\theta^k, o_1) - f_w(\theta^k, o_1^k)\right)$$

$$\ge w_1 \, \mathrm{rgt}_1(\theta^k),$$

where the second inequality holds because $\mathcal{L}(o_1^k, o_1) \ge 0$, and the final inequality follows
from Lemma 1. Dividing by $w_1 > 0$ completes the proof. □

We choose not to enforce $w_1 > 0$ explicitly in Training Problem 2, because adding this
constraint leads to a dual problem that references $\psi'$ outside of an inner product and thus
makes computation of all but linear or low-dimensional polynomial kernels prohibitively
expensive. Instead, in our experiments we simply discard hypotheses where the result of
training is $w_1 \le 0$. This is sensible since the discriminant function value should increase
as an agent's value increases, and negative values of $w_1$ typically mean that the training
parameter $C$ or the kernel parameter $\gamma$ (if the RBF kernel is used) is poorly chosen. It
turns out that $w_1$ is indeed positive most of the time, and for every experiment a majority
of the choices of $C$ and $\gamma$ yield positive $w_1$ values. For this reason, we do not expect the
requirement that $w_1 > 0$ to be a problem in practice.^7

4.2.1 Payment Normalization 

One issue with the framework as stated is that the payments $p_w$ computed from the solution
to Training Problem 2 could be negative.

We solve this problem by normalizing payments, using a baseline outcome $o_b$: if there
exists an outcome $o'$ such that $v_1(\theta_1, o') = 0$ for every $\theta_1$, this "null outcome" is used as
the baseline; otherwise, we use the outcome with the lowest payment. Let $t_w(\theta_{-1}, o_1)$ be
the price function corresponding to the solution $w$ to Training Problem 2. Adopting the
baseline outcome, the normalized payments $t'_w(\theta_{-1}, o_1)$ are defined as

$$t'_w(\theta_{-1}, o_1) = \max(0, \; t_w(\theta_{-1}, o_1) - t_w(\theta_{-1}, o_b)).$$

Note that $o_b$ is only a function of $\theta_{-1}$, even when there is no null outcome, so $t'_w$ is still
only a function of $\theta_{-1}$ and $o_1$.
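
The normalization step can be sketched in a few lines of Python. The sketch assumes that the prices $t_w(\theta_{-1}, \cdot)$ for a fixed $\theta_{-1}$ are available as a dictionary keyed by outcome; the function names are hypothetical.

```python
def pick_baseline(prices, null_outcome=None):
    """Use the null outcome if one exists, else the lowest-priced outcome."""
    if null_outcome is not None:
        return null_outcome
    return min(prices, key=prices.get)

def normalize_prices(prices, baseline):
    """t'_w(o) = max(0, t_w(o) - t_w(o_b)) for every outcome o."""
    base = prices[baseline]
    return {o: max(0.0, t - base) for o, t in prices.items()}
```

Because the baseline is chosen from the prices alone, the normalized price still does not depend on agent 1's own report.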

4.2.2 Individual Rationality Violation 

Even after normalization, the learned payment rule $p_w$ may not satisfy IR. We offer three
solutions to this problem, which can be used in combination.

Payment offsets One way to decrease the rate of IR violation is to add a payment offset,
which decreases all payments (for all type reports) by a given amount. We apply this
payment offset to all bundles other than $o_b$; as with payment normalization, the adjusted
payment is set to 0 if it is negative.^8 Note that payment offsets decrease IR violation, but
may increase regret. For instance, suppose there are only two outcomes $o_{11}, o_{12}$, where
$o_{12}$ is the null outcome. Suppose agent 1 values $o_{11}$ at 5 and receives the null outcome if
he reports truthfully. Suppose further that payments $t_w$ are 7 for $o_{11}$ and 0 for the null
outcome. With no payment offset, the agent experiences no regret, since he receives utility
0 from the null outcome, but negative utility from $o_{11}$. However, if the payment offset
is greater than 2, the agent's regret becomes positive (assuming consumer sovereignty)
because he could have reported differently, received $o_{11}$, and obtained positive utility.

7 For multi-minded combinatorial auctions, 1049/1080 > 97% of the trials had positive $w_1$; for the
assignment problem all of the trials did; see Section 5 for details.

8 It is again crucial that $o_b$ depends only on $\theta_{-1}$, so that the payment remains independent of $\theta_1$ given $o_1$.
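
The two-outcome example can be checked numerically with the following Python sketch. It assumes consumer sovereignty in the simplest possible way, namely that the agent can obtain any outcome by some report, and the function names are hypothetical.

```python
def utility(value, price):
    """Quasi-linear utility of an outcome at a given price."""
    return value - price

def regret_with_offset(values, prices, truthful, null, offset):
    """Ex post regret of one agent after applying a payment offset.

    The offset is subtracted from every price except that of the null
    (baseline) outcome, and adjusted prices are clipped at zero.
    """
    adjusted = {o: (t if o == null else max(0.0, t - offset))
                for o, t in prices.items()}
    u_truthful = utility(values[truthful], adjusted[truthful])
    u_best = max(utility(values[o], adjusted[o]) for o in adjusted)
    return max(0.0, u_best - u_truthful)
```

With values 5 and 0 and prices 7 and 0, regret is zero for offsets up to 2 and equals the offset minus 2 beyond that, as long as the adjusted price stays positive.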

Adjusting the loss function $\mathcal{L}$ We incur an IR violation when there is a null outcome
$o_{null}$ such that $g_1(\theta) \ne o_{null}$ and $f_w(\theta, o_{null}) > f_w(\theta, g_1(\theta))$ for some type $\theta$, assuming
truthful reports. This happens because $f_w(\theta, o_1)$ is a scaled version of the agent's utility
for outcome $o_1$ under payments $p_w$. If the utility for the null outcome is greater than the
utility for $g_1(\theta)$, then the payment $t_w(\theta_{-1}, g_1(\theta))$ must be greater than $v_1(\theta_1, g_1(\theta))$, causing
an IR violation. We can discourage these types of errors by modifying the constraints of
Training Problem 2: when $o_1^k \ne o_{null}$ and $o_1 = o_{null}$, we can increase $\mathcal{L}(o_1^k, o_1)$ to heavily
penalize misclassifications of this type. With a larger $\mathcal{L}(o_1^k, o_1)$, a larger $\xi^k$ will be required if
$f_w(\theta^k, o_1^k) < f_w(\theta^k, o_{null})$. As with payment offsets, this technique will decrease IR violations
but is not guaranteed to eliminate all of them. In our experimental results, we refer to this
as the null loss fix, and the null loss refers to the value we choose for $\mathcal{L}(o_1^k, o_{null})$ where
$o_1^k \ne o_{null}$.
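
A loss function of this kind can be sketched in Python as follows. The factory function and its argument names are hypothetical, and all other misclassifications are assumed to keep the 0/1 loss.

```python
def make_loss(null_outcome, null_loss):
    """Build a loss L(o_true, o_pred) with a separate penalty for wrongly
    predicting the null outcome; other errors keep the 0/1 loss."""
    def loss(o_true, o_pred):
        if o_true == o_pred:
            return 0.0
        if o_true != null_outcome and o_pred == null_outcome:
            return null_loss
        return 1.0
    return loss
```

Setting `null_loss` to 1 recovers the plain 0/1 loss.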

Deallocation In settings that have a null outcome and are downward closed (i.e., settings
where a feasible outcome $o$ remains feasible if $o_i$ is replaced with the null outcome), we
modify the function $g$ to allocate the null outcome whenever the price function $t_w$ creates
an IR violation. This reduces ex post regret and in particular ensures ex post IR. On
the other hand, the total value to the agents necessarily decreases under the modified
allocation. In our experimental results, we refer to this as the deallocation fix.
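
For a single agent, the deallocation fix amounts to the following check, shown here as a hedged Python sketch. The function name is hypothetical, and the sketch assumes that the null outcome carries a payment of zero, as it does when the null outcome is the normalization baseline.

```python
def deallocate(outcome, value, price, null_outcome):
    """Replace an IR-violating allocation by the null outcome.

    Returns the (possibly modified) outcome and the payment charged.
    Assumes the null outcome has payment 0.
    """
    if price > value:
        return null_outcome, 0.0
    return outcome, price
```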

5 Applying the Framework 

In this section, we discuss the application of our framework to two domains: multi-minded
combinatorial auctions and egalitarian welfare in the assignment problem.

5.1 Multi-Minded Combinatorial Auctions 

A combinatorial auction allocates items $\{1, \ldots, r\}$ among $n$ agents, such that each agent
receives a possibly empty subset of the items. The outcome space $\Omega_i$ for agent $i$ is thus
the set of all subsets of the $r$ items, and the type of agent $i$ can be represented by a
vector $\theta_i \in \Theta_i = \mathbb{R}^{2^r}$ that specifies its value for each possible bundle. The set of possible
type profiles is then $\Theta = \mathbb{R}^{2^r n}$, and the value $v_i(\theta_i, o_i)$ of agent $i$ for bundle $o_i$ is equal to
the entry in $\theta_i$ corresponding to $o_i$. We require that valuations are monotone, such that
$v_i(\theta_i, o_i) \ge v_i(\theta_i, o_i')$ for all $o_i, o_i' \in \Omega_i$ with $o_i' \subseteq o_i$, and normalized such that $v_i(\theta_i, \emptyset) = 0$.
Assuming agent symmetry and adopting the view of agent 1, the partial outcome rule
$g_1 : \Theta \to \Omega_1$ specifies the bundle $g_1(\theta)$ allocated to agent 1; we require feasibility, so that
no item is allocated more than once.

In a multi-minded CA, each agent is interested in at most $b$ bundles for some constant
$b$. The special case where $b = 1$ is called a single-minded CA. In our framework, the
restriction to multi-minded CAs leads to a number of computational advantages. First,
valuation profiles and thus the training data can be represented in a compact way, by
explicitly writing down the valuations for the constant number of bundles each agent is
interested in. Second, inner products between valuation profiles, which are required to
apply the kernel trick, can be computed in constant time.

5.1.1 Attribute Maps 

To apply structural SVMs to multi-minded CAs, we need to specify an appropriate
attribute map $\chi$. In our experiments we use two attribute maps $\chi_1 : \Theta_{-1} \times \Omega_1 \to \mathbb{R}^{2^r (2^r (n-1))}$
and $\chi_2 : \Theta_{-1} \times \Omega_1 \to \mathbb{R}^{2^r (n-1)}$, which are defined as follows:

$$\chi_1(\theta_{-1}, o_1) = \begin{pmatrix} 0 \\ \theta_{-1} \\ 0 \end{pmatrix}, \qquad \chi_2(\theta_{-1}, o_1) = \begin{pmatrix} \theta_2 \setminus o_1 \\ \theta_3 \setminus o_1 \\ \vdots \\ \theta_n \setminus o_1 \end{pmatrix},$$

where in $\chi_1$ the upper block of zeros has $\mathrm{dec}(o_1) \cdot 2^r (n-1)$ entries and the lower block has
$(2^r - \mathrm{dec}(o_1) - 1) \cdot 2^r (n-1)$ entries.

Here, $\mathrm{dec}(o_1) = \sum_{j=1}^{r} 2^{j-1} I_{j \in o_1}$ is a decimal index of bundle $o_1$, where $I_{j \in o_1} = 1$ if $j \in o_1$
and $I_{j \in o_1} = 0$ otherwise. Attribute map $\chi_1$ thus stacks the vector $\theta_{-1}$, which represents
the valuations of all agents except agent 1, with zero vectors of the same dimension, where
the position of $\theta_{-1}$ is determined by the index of bundle $o_1$. The resulting attribute vector
is simple but potentially restrictive. It precludes two instances with different allocated
bundles from sharing attributes, which provides an obstacle to generalization of the
discriminant function across bundles. Attribute map $\chi_2$ stacks vectors $\theta_i \setminus o_1$, which are
obtained from $\theta_i$ by setting the entries for all bundles that intersect with $o_1$ to 0. This
captures the fact that agent $i$ cannot be allocated any of the bundles that intersect with
$o_1$ if $o_1$ is allocated to agent 1.^9
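
The two attribute maps can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the code used in our experiments: items are numbered $1, \ldots, r$, each $\theta_i$ is stored as a dense vector of length $2^r$ indexed by the decimal index of the bundle, the index uses weights $2^{j-1}$, and the function names are hypothetical.

```python
import numpy as np

def dec(bundle):
    """Decimal index of a bundle of items numbered 1..r (weights 2^(j-1))."""
    return sum(2 ** (j - 1) for j in bundle)

def chi1(theta_minus1, o1, r):
    """Place the flattened theta_{-1} in the block selected by dec(o1).

    theta_minus1 has shape (n-1, 2**r); the result has 2**r blocks.
    """
    flat = np.asarray(theta_minus1, dtype=float).ravel()
    out = np.zeros(2 ** r * flat.size)
    start = dec(o1) * flat.size
    out[start:start + flat.size] = flat
    return out

def chi2(theta_minus1, o1, r):
    """Zero every entry whose bundle intersects o1, then flatten."""
    theta = np.array(theta_minus1, dtype=float)   # copy, shape (n-1, 2**r)
    mask = dec(o1)                                # bitmask of the items in o1
    for b in range(2 ** r):
        if b & mask:                              # bundle b shares an item with o1
            theta[:, b] = 0.0
    return theta.ravel()
```

In the multi-minded setting one would store only the at most $b$ non-trivial entries per agent; the dense layout is used here only to keep the sketch short.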



5.1.2 Efficient Computation of Inner Products 

Efficient computation of inner products is possible for both $\chi_1$ and $\chi_2$. A full discussion can
be found in Appendix A.

9 Both $\chi_1$ and $\chi_2$ are defined for a particular number of items and agents, and in our experiments we
train a different classifier for each number of agents and items. In practice, one can pad out items and
agents by setting bids to zero and train a single classifier.

5.1.3 Dealing with an Exponentially Large Output Space

Recall that Training Problems 1 and 2 have constraints for every training example $(\theta^k, o_1^k)$
and every possible bundle of items $o_1 \in \Omega_1$, of which there are exponentially many in the
number of items in the case of CAs. In lieu of an efficient separation oracle, a workaround
exists when the discriminant function has additional structure, such that the induced
payment weakly increases as items are added to a bundle. Given this item monotonicity, it
would suffice to include constraints for bundles that have a strictly larger value to the agent
than any of their respective subsets.

Still, it remains an open problem whether item monotonicity itself can be imposed on
the hypothesis class with a small number of constraints.^10 An alternative is to optimistically
assume item monotonicity, only including the constraints associated with bundles that
are explicit in agent valuations. The baseline experimental results in Section 6 do not
assume item monotonicity and instead use a separation oracle that iterates over all possible
bundles $o_1 \in \Omega_1$. We also present results which test the idea of optimistically assuming
item monotonicity, and while there is a degradation in performance, results are mostly
comparable.

10 For polynomial kernels and certain attribute maps, a possible sufficient condition for item monotonicity
is to force the weights $w_{-1}$ to be negative. However, as with the discussion of enforcing $w_1 > 0$ directly,
these weight constraints do not dualize conveniently, and the resulting dual formulation no longer operates
only on inner products $\langle \psi'(\theta_{-1}, o_1), \psi'(\theta'_{-1}, o_1') \rangle$. As a result, we would be forced to work in the primal, and incur
extra computational overhead that increases polynomially with the kernel degree $d$. We have performed
some preliminary experiments with polynomial kernels, but we have not looked into reformulating the
primal to enforce item monotonicity.

5.2 The Assignment Problem

In the assignment problem, we are given a set of $n$ agents and a set $\{1, \ldots, n\}$ of items,
and wish to assign each item to exactly one agent. The outcome space of agent $i$ is thus
$\Omega_i = \{1, \ldots, n\}$, and its type can be represented by a vector $\theta_i \in \Theta_i = \mathbb{R}^n$. The set
of possible type profiles is then $\Theta = \mathbb{R}^{n^2}$. We consider an outcome rule that maximizes
egalitarian welfare in a lexicographic manner: first, the minimum value of any agent is
maximized; if more than one outcome achieves this value, the second lowest value is
maximized, and so forth. This outcome rule can be computed by solving a sequence of
integer programs. As before, we assume agent symmetry and adopt the view of agent 1.
To complete our specification of the structural SVM framework for this problem, we
need to define an attribute map $\chi_3 : \mathbb{R}^{n^2 - n} \times \mathbb{N} \to \mathbb{R}^s$, where the first argument is the type
profile of all agents but agent 1, the second argument is the item assigned to agent 1, and $s$
is a dimension of our choosing. A natural choice for $\chi_3$ is to set

$$\chi_3(\theta_{-1}, j) = (\theta_2[-j], \theta_3[-j], \ldots, \theta_n[-j]) \in \mathbb{R}^{(n-1)^2},$$

where $\theta_i[-j]$ denotes the vector obtained from $\theta_i$ by removing the $j$th entry. The attribute
map thus reflects the agents' values for all items except item $j$, capturing the fact that the
item assigned to agent 1 cannot be assigned to any other agent. Since the outcome space
is very small, we choose not to use a non-linear kernel on top of this attribute vector.
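
The attribute map $\chi_3$ is a one-liner with NumPy, shown here as a minimal sketch. The function name is hypothetical, and $\theta_{-1}$ is assumed to be stored as an $(n-1) \times n$ array with one row per remaining agent.

```python
import numpy as np

def chi3(theta_minus1, j):
    """Drop the column of item j (1-based) from every other agent's values
    and stack the remaining entries into a vector of length (n-1)^2."""
    theta = np.asarray(theta_minus1, dtype=float)
    return np.delete(theta, j - 1, axis=1).ravel()
```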

6 Experimental Evaluation 

We perform a series of experiments to test our theoretical framework. To run our
experiments, we use the SVM^struct package [12], which allows for the use of custom kernel
functions, attribute maps, and separation oracles.

6.1 Setup 

We begin by briefly discussing our experimental methodology, performance metrics, and 
optimizations used to speed up the experiments. 

6.1.1 Methodology 

For each of the settings we consider, we generate three data sets: a training set, a validation
set, and a test set. The training set is used as input to Training Problem 2, which in turn
yields classifiers $h_w$ and corresponding payment rules $p_w$. For each choice of the parameter
$C$ of Training Problem 2, and the parameter $\gamma$ if the RBF kernel is used, a classifier $h_w$ is
learned based on the training set and evaluated based on the validation set. The classifier
with the highest accuracy on the validation set is then chosen and evaluated on the test set.
During training, we take the perspective of agent 1, so a training set size of $\ell$ means that
we train an SVM on $\ell$ examples. Once a partial outcome rule has been learned, however,
it can be used to infer payments for all agents. We exploit this fact during testing, and
report performance metrics across all agents for a given instance in the test set.

6.1.2 Metrics 

We employ three metrics to measure the performance of the learned classifiers. These
metrics are computed over the test set $\{(\theta^k, o^k)\}_{k=1}^{\ell}$.

Classification accuracy Classification accuracy measures the accuracy of the trained
classifier in predicting the outcome. Each of the $\ell$ instances has $n$ agents, so in
total we measure accuracy over $n\ell$ instances:^11

$$\text{accuracy} = 100 \cdot \frac{\sum_{k=1}^{\ell} \sum_{i=1}^{n} I\left(h_w(\theta_i^k, \theta_{-i}^k) = o_i^k\right)}{n\ell}.$$

Ex post regret We measure ex post regret by averaging the ex post regret experienced
by all agents over the $\ell$ instances in the dataset, i.e.,

$$\text{regret} = \frac{\sum_{k=1}^{\ell} \sum_{i=1}^{n} \mathrm{rgt}_i(\theta_i^k, \theta_{-i}^k)}{n\ell}.$$

Individual rationality violation This metric measures the fraction of individual
rationality violations across all agents, i.e., the fraction of agents whose payment exceeds
their value for the assigned outcome:

$$\text{ir-violation} = \frac{\sum_{k=1}^{\ell} \sum_{i=1}^{n} I\left(p_{w,i}(\theta^k) - v_i(\theta_i^k, o_i^k) > 0\right)}{n\ell}.$$
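
Given per-agent evaluation records, the three metrics reduce to simple averages, as in the following Python sketch. The record layout and key names are hypothetical, and an IR violation is counted whenever the payment strictly exceeds the agent's value for its assigned outcome.

```python
def metrics(records):
    """Compute (accuracy in percent, mean regret, IR-violation fraction).

    records holds one dict per (instance, agent) pair with the keys
    'predicted', 'actual', 'regret', 'payment' and 'value'.
    """
    total = len(records)
    accuracy = 100.0 * sum(r["predicted"] == r["actual"] for r in records) / total
    regret = sum(r["regret"] for r in records) / total
    ir_violation = sum(r["payment"] > r["value"] for r in records) / total
    return accuracy, regret, ir_violation
```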



6.1.3 Optimizations 

In the case of multi-minded CAs we map the inputs $\theta_{-1}$ into a smaller space, which
allows us to learn more effectively with smaller amounts of data.^12 We use instance-based
normalization, which normalizes the values in $\theta_{-1}$ by the highest observed value and then
rescales the computed payment appropriately, and sorting, which orders agents based on
bid values.



Instance-Based Normalization The first technique we use is instance-based normalization.
Before passing examples $\theta$ to the learning algorithm or learned classifier, they are
normalized by a positive multiplier so that the value of the highest bid by agents other
than agent 1 is exactly 1. The values and the solution are then transformed back to the
original scale before computing the payment rule $p_w$. This technique leverages the
observation that agent 1's allocation depends on the relative values of the other agents'
reports (scaling all reports by a factor should not affect the outcome chosen).

11 For a given instance $\theta$, there are actually many ways to choose $(\theta_i, \theta_{-i})$ depending on the ordering of
all agents but agent $i$. We discuss a technique we refer to as sorting in Section 6.1.3, which will choose a
particular ordering. When this technique is not used, for example in our experiments for the assignment
problem, we simply fix an ordering of the other agents for each agent $i$ and use the same ordering across
all instances.

12 The barrier to using more data is not the availability of the data itself, but the time required for training,
because training time scales quadratically in the size of the training set due to the use of non-linear kernels.

Sorting The second technique we use is sorting. With sorting, instead of choosing an
arbitrary ordering of agents in $\theta_{-i}$, we choose a specific ordering based on the maximum
value the agent reports. In the single-item setting, this amounts to ordering agents by their
value. In the multi-minded CA setting, agents are ordered by the value they report for
their most desired bundle. The intuition behind sorting is that we can again decrease the
space of possible $\theta_{-i}$ reports the learner sees and learn more quickly. In the single-item
case, we know that the second price payment rule only depends on the maximum value
across all other agents, and sorting places this value in the first coordinate of $\theta_{-i}$.
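
Both preprocessing steps can be sketched together in Python. The function name is hypothetical, the other agents' reports are assumed to be given as lists of bid values, and sorting is assumed to be in decreasing order of each agent's maximum bid so that the largest value comes first.

```python
def sort_and_normalize(theta_minus1):
    """Sort the other agents by their highest bid (largest first) and scale
    all bids so that the highest one equals 1.

    Returns the transformed bids and the scale factor, which is needed to
    map a payment computed on the normalized instance back to the original
    scale (payment * scale).
    """
    ordered = sorted(theta_minus1, key=max, reverse=True)
    scale = max(max(bids) for bids in ordered)
    if scale <= 0:
        return [list(bids) for bids in ordered], 1.0
    return [[b / scale for b in bids] for bids in ordered], scale
```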

6.2 Single-Item Auction 

As a sanity check, we perform experiments on the single-item auction with the optimal
(value-maximizing) outcome rule $g$, under which the agent with the highest bid receives
the item. Here $D$ is the distribution in which agent values are drawn independently and
uniformly from $[0, 1]$. We use a training set size of 300 and validation and test set sizes
of 1000. In this case, we know that the associated payment function that makes $(g, p)$
strategyproof is the second price payment rule.

The results reported in Table 1 and Figure 1 are for the $\chi_1$ and $\chi_2$ attribute maps, which
can be applied to this setting by observing that single-item auctions are a special case of
multi-minded CAs. In particular, letting $z$ be the zero vector of dimension $n - 1$,
$\chi_1(\theta_{-1}, o_1) = (\theta_{-1}, z)$ if $o_1 = \emptyset$ and $\chi_1(\theta_{-1}, o_1) = (z, \theta_{-1})$ if $o_1 = \{1\}$, while
$\chi_2(\theta_{-1}, o_1) = \theta_{-1}$ if $o_1 = \emptyset$ and $\chi_2(\theta_{-1}, o_1) = z$ if $o_1 = \{1\}$.

For both choices of the attribute map we obtain accuracy of at least 93% (Table 1) and a
very close approximation to the second-price payment rule. This shows that the framework
is able to automatically learn the payment rule of Vickrey's auction.



          accuracy          regret         ir-violation
 n      χ1      χ2       χ1      χ2       χ1      χ2
 2    99.7    93.1    0.000   0.003     0.00    0.07
 3    98.7    97.6    0.000   0.000     0.01    0.00
 4    98.4    99.1    0.000   0.000     0.00    0.01
 5    97.3    96.6    0.001   0.001     0.02    0.00
 6    97.6    97.4    0.000   0.001     0.00    0.02

Table 1: Performance metrics for single-item auction.

Figure 1: Learned payment rule vs. second-price payment rule for single-item auction with
2 agents, for $\chi_1$ (left) and $\chi_2$ (right). In both panels the horizontal axis is the value of agent 2.

6.3 Multi-Minded CAs 
6.3.1 Type Distribution 

Recall that in a multi-minded setting, there are $r$ items, and each agent is interested in
exactly $b$ bundles. For each bundle, we use the following procedure (inspired by Sandholm's
decay distribution for the single-minded setting [23]) to determine which items are included
in the bundle. We first assign an item to the bundle uniformly at random. Then with
probability $\alpha$, we add another random item (chosen uniformly from the remaining items),
and with probability $(1 - \alpha)$ we stop. We continue this procedure until we stop or have
exhausted the items. We use $\alpha = 0.75$ to be consistent with [23], as they report that the
winner determination problem (finding the feasible allocation that maximizes total value)
is difficult for this setting of $\alpha$.
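
The bundle-sampling procedure can be sketched in Python as follows. The function name is hypothetical, and a `random.Random` instance is passed in so that runs are reproducible.

```python
import random

def sample_bundle(r, alpha, rng):
    """Sample a bundle over items 1..r: start with one uniformly random item,
    then keep adding a uniformly random remaining item with probability alpha."""
    items = list(range(1, r + 1))
    rng.shuffle(items)                 # popping from a shuffled list is a uniform draw
    bundle = [items.pop()]
    while items and rng.random() < alpha:
        bundle.append(items.pop())
    return frozenset(bundle)
```

For example, `sample_bundle(5, 0.75, random.Random(0))` draws one bundle for a setting with 5 items.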

Once the bundle identities have been determined, we sample values for these bundles.
Let $c$ be an $r$-dimensional vector with entries chosen uniformly from $(0, 1]$. For each agent $i$,
let $d_i$ be an $r$-dimensional vector with entries chosen uniformly from $(0, 1]$. Each entry of
$c$ denotes the common value of a specific item, while each entry of $d_i$ denotes the private
value of a specific item for agent $i$. The value of bundle $S_{ij}$ is then given by

. /(%,/3c + (l-/3)d. J )\ c 
va = mm — - 

-V<^ V '• J 

for parameters $\beta \in [0, 1]$ and $\zeta$. The inner product in the numerator corresponds to a
sum over values of items, where common and private values for each item are respectively
weighted with $\beta$ and $(1 - \beta)$. The denominator normalizes all valuations to the interval
$(0, 1]$. Parameter $\zeta$ controls the degree of complementarity among items: $\zeta > 1$ implies
that goods are complements, whereas $\zeta < 1$ means that goods are substitutes. Choosing
the minimum over bundles $S'_{ij}$ contained in $S_{ij}$ finally ensures that the resulting valuations
are monotonic.



6.3.2 Outcome Rules 

We use two outcome rules in our experiments. For the optimal outcome rule $g_{opt}$, the payment
rule $p_{vcg}$ makes the mechanism $(g_{opt}, p_{vcg})$ strategyproof. Under this payment rule, agent
$i$ pays the externality it imposes on other agents. That is,

$$p_{vcg,i}(\theta) = \max_{o \in \Omega} \left( \sum_{j \ne i} v_j(\theta_j, o_j) \right) - \sum_{j \ne i} v_j(\theta_j, g_j(\theta)).$$

The second outcome rule with which we experiment is a generalization of the greedy
outcome rule for single-minded CAs of Lehmann et al. [16]. Our generalization of the greedy
rule is as follows. Let $\theta$ be the agent valuations and $o_i(j)$ denote the $j$-th bundle desired by
agent $i$. For each bundle $o_i(j)$, assign a score $v_i(\theta_i, o_i(j)) / \sqrt{|o_i(j)|}$, where $|o_i(j)|$ indicates
the number of items in bundle $o_i(j)$. The greedy outcome rule orders the desired bundles by
this score, and takes the bundle $o_i(j)$ with the next highest score as long as agent $i$ has not
already been allocated a bundle and $o_i(j)$ does not contain any items already allocated.
While this greedy outcome rule has an associated payment rule that makes it strategyproof
in the single-minded case, it is not implementable in the multi-minded case as the example
in Appendix B shows.
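
The generalized greedy rule can be sketched in Python as follows. The function name and input layout are hypothetical, and ties between equal scores are broken by input order, which is an assumption of this sketch.

```python
import math

def greedy_allocation(bids):
    """Greedy outcome rule for multi-minded bidders.

    bids maps each agent to a list of (bundle, value) pairs, where bundle is
    a non-empty frozenset of items. Returns a dict agent -> allocated bundle.
    """
    scored = [(value / math.sqrt(len(bundle)), agent, bundle)
              for agent, agent_bids in bids.items()
              for bundle, value in agent_bids]
    scored.sort(key=lambda entry: entry[0], reverse=True)  # highest score first
    taken, allocation = set(), {}
    for _, agent, bundle in scored:
        # Accept only if the agent is still unserved and no item is reused.
        if agent not in allocation and not (bundle & taken):
            allocation[agent] = bundle
            taken |= bundle
    return allocation
```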



6.3.3 Description of Experiments 

We experiment with training sets of sizes 100, 300, and 500, and validation and test sets
of size 1000. All experiments we report on are for a setting with 5 agents, 5 items, and 3
bundles per agent, and use $\beta = 0.5$, the RBF kernel, and parameters $C \in \{10^4, 10^5\}$ and
$\gamma \in \{0.01, 0.1, 1\}$.



6.3.4 Basic Results 

Table 2 presents the basic results for multi-minded CAs with optimal and greedy outcome
rules, respectively. For both outcome rules, we present the results for $p_{vcg}$ as a baseline.
Because $p_{vcg}$ is the strategyproof payment rule for the optimal outcome rule, $p_{vcg}$ always
has accuracy 100, regret 0, and IR violation 0 for the optimal outcome rule.

Across all instances, as expected, accuracy is negatively correlated with regret and ex
post IR violation. The degree of complementarity between items, $\zeta$, as well as the outcome
rule chosen, has a major effect on the results. Instances with low complementarity ($\zeta = 0.5$)
yield payment rules with higher regret, and $\chi_1$ performs better on the greedy outcome
rule while $\chi_2$ performs better on the optimal outcome rule. For high complementarity
between items the greedy outcome tends to allocate all items to a single agent, and the
learned price function sets high prices for small bundles to capture this property. For
low complementarity the allocation tends to be split and less predictable. Still, the best
classifiers achieve average ex post regret of less than 0.032 (for values normalized to [0,1])
even though the corresponding prediction accuracy can be as low as 67%. For the greedy
outcome rule, the performance of $p_{vcg}$ is comparable for $\zeta \in \{1.0, 1.5\}$ but worse than the
payment rule learned in our framework in the case of $\zeta = 0.5$, where the greedy outcome
rule becomes less optimal.

Optimal outcome rule

               accuracy               regret             ir-violation
 n    ζ   p_vcg    χ1    χ2    p_vcg     χ1     χ2    p_vcg    χ1    χ2
 2  0.5     100  70.7  91.9        0  0.014  0.002      0.0  0.06  0.03
 3  0.5     100  54.5  75.4        0  0.037  0.017      0.0  0.19  0.10
 4  0.5     100  53.8  67.7        0  0.042  0.031      0.0  0.22  0.18
 5  0.5     100  15.8  67.0        0  0.133  0.032      0.0  0.26  0.19
 6  0.5     100  61.1  68.2        0  0.037  0.032      0.0  0.22  0.20
 2  1.0     100  84.5  93.4        0  0.008  0.001      0.0  0.08  0.02
 3  1.0     100  77.1  83.5        0  0.012  0.005      0.0  0.13  0.09
 4  1.0     100  74.6  81.1        0  0.014  0.009      0.0  0.16  0.12
 5  1.0     100  73.4  77.4        0  0.018  0.011      0.0  0.19  0.12
 6  1.0     100  75.0  77.7        0  0.020  0.013      0.0  0.20  0.16
 2  1.5     100  91.5  96.9        0  0.004  0.000      0.0  0.06  0.02
 3  1.5     100  91.0  93.4        0  0.004  0.001      0.0  0.05  0.03
 4  1.5     100  92.5  94.2        0  0.003  0.001      0.0  0.03  0.04
 5  1.5     100  91.7  93.9        0  0.004  0.002      0.0  0.06  0.03
 6  1.5     100  91.9  93.7        0  0.003  0.001      0.0  0.05  0.04

Greedy outcome rule

               accuracy               regret             ir-violation
 n    ζ   p_vcg    χ1    χ2    p_vcg     χ1     χ2    p_vcg    χ1    χ2
 2  0.5    50.9  59.1  40.6    0.079  0.030  0.172     0.22  0.12  0.33
 3  0.5    55.4  57.9  54.7    0.070  0.030  0.088     0.18  0.21  0.36
 4  0.5    61.1  58.2  57.9    0.056  0.033  0.056     0.14  0.20  0.31
 5  0.5    64.9  61.3  63.0    0.048  0.027  0.042     0.13  0.19  0.24
 6  0.5    66.6  63.8  63.8    0.041  0.034  0.045     0.12  0.20  0.24
 2  1.0    87.8  86.6  84.0    0.007  0.005  0.008     0.04  0.06  0.09
 3  1.0    85.3  86.7  85.7    0.006  0.006  0.006     0.04  0.07  0.05
 4  1.0    82.4  86.5  84.2    0.006  0.006  0.007     0.05  0.08  0.08
 5  1.0    82.7  85.8  84.9    0.007  0.009  0.009     0.04  0.10  0.10
 6  1.0    80.0  87.4  88.1    0.006  0.007  0.005     0.04  0.08  0.07
 2  1.5    94.7  91.1  91.7    0.002  0.002  0.002     0.02  0.04  0.04
 3  1.5    97.1  92.8  93.2    0.001  0.002  0.001     0.01  0.02  0.04
 4  1.5    96.4  91.5  92.1    0.001  0.003  0.002     0.02  0.07  0.07
 5  1.5    97.5  90.5  91.4    0.001  0.004  0.002     0.01  0.06  0.04
 6  1.5    98.4  92.2  92.8    0.000  0.003  0.002     0.01  0.06  0.06

Table 2: Results for multi-minded CA with training set size 500.

6.3.5 Effect of Training Set Size 

Table 3 charts performance as the training set size is varied for the greedy outcome rule.
While training data is readily available (we can simply sample from $D$ and run the outcome
rule $g$), training time becomes prohibitive for larger training set sizes. Table 3 shows that
regret decreases with larger training sets, and for a training set size of 500, the best of $\chi_1$
and $\chi_2$ outperforms $p_{vcg}$ for $\zeta = 0.5$ and is comparable to $p_{vcg}$ for $\zeta \in \{1.0, 1.5\}$.

Accuracy

                       100            300            500
 n    ζ   p_vcg     χ1     χ2      χ1     χ2      χ1     χ2
 2  0.5    50.9   54.3   48.2    57.0   46.9    59.1   40.6
 3  0.5    55.4   50.1   49.8    55.7   54.4    57.9   54.7
 4  0.5    61.1   53.4   56.2    56.4   58.5    58.2   57.9
 5  0.5    64.9   14.2   57.9    61.0   61.8    61.3   63.0
 6  0.5    66.6   58.4   58.8    62.2   63.9    63.8   63.8
 2  1.0    87.8   80.7   80.5    84.4   84.1    86.6   84.0
 3  1.0    85.3   74.9   78.0    83.0   80.6    86.7   85.7
 4  1.0    82.4   78.5   80.1    84.2   83.1    86.5   84.2
 5  1.0    82.7   81.0   81.8    84.3   84.3    85.8   84.9
 6  1.0    80.0   81.8   83.7    87.6   88.3    87.4   88.1
 2  1.5    94.7   83.3   88.1    89.3   89.8    91.1   91.7
 3  1.5    97.1   86.9   87.6    90.3   91.5    92.8   93.2
 4  1.5    96.4   88.4   90.7    89.3   90.8    91.5   92.1
 5  1.5    97.5   87.2   88.5    91.4   90.5    90.5   91.4
 6  1.5    98.4   86.3   86.8    91.4   92.5    92.2   92.8

Regret

                       100            300            500
 n    ζ   p_vcg     χ1     χ2      χ1     χ2      χ1     χ2
 2  0.5   0.079  0.045  0.195   0.032  0.098   0.030  0.172
 3  0.5   0.070  0.054  0.078   0.038  0.082   0.030  0.088
 4  0.5   0.056  0.050  0.059   0.040  0.061   0.033  0.056
 5  0.5   0.048  0.173  0.064   0.038  0.048   0.027  0.042
 6  0.5   0.041  0.039  0.059   0.037  0.049   0.034  0.045
 2  1.0   0.007  0.010  0.010   0.009  0.008   0.005  0.008
 3  1.0   0.006  0.020  0.011   0.009  0.009   0.006  0.006
 4  1.0   0.006  0.015  0.014   0.008  0.009   0.006  0.007
 5  1.0   0.007  0.020  0.014   0.010  0.009   0.009  0.009
 6  1.0   0.006  0.062  0.018   0.008  0.005   0.007  0.005
 2  1.5   0.002  0.008  0.003   0.003  0.002   0.002  0.002
 3  1.5   0.001  0.005  0.004   0.003  0.002   0.002  0.001
 4  1.5   0.001  0.005  0.003   0.004  0.003   0.003  0.002
 5  1.5   0.001  0.006  0.004   0.003  0.003   0.004  0.002
 6  1.5   0.000  0.011  0.007   0.004  0.002   0.003  0.002

Table 3: Effect of training set size on accuracy of learned classifier. Multi-minded CA,
greedy outcome rule. Training set size is given in the column labels for $\chi_1$, $\chi_2$. $p_{vcg}$ does
not have a training set size.

6.3.6 IR Fixes

Table 4 summarizes our results regarding the various fixes to IR violations, for the
particularly challenging case of the greedy outcome rule and $\zeta = 0.5$. The extent of IR violation
decreases with larger payment offset and null loss. Regret tends to move in the opposite
direction, but there are cases where IR violation and regret both decrease. The three

payment  accuracy          regret               ir-violation      ir-fix-welfare-avg
offset   0.5   1.0   1.5   0.5    1.0    1.5    0.5   1.0   1.5   0.5   1.0   1.5
0        59.7  61.8  61.7  0.065  0.048  0.042  0.35  0.26  0.21  0.27  0.43  0.52
0.05     61.7  61.2  60.1  0.054  0.045  0.044  0.29  0.20  0.15  0.37  0.54  0.65
0.10     62.1  59.3  56.7  0.048  0.047  0.051  0.23  0.14  0.10  0.48  0.66  0.75
0.15     60.4  55.1  52.2  0.047  0.055  0.064  0.17  0.10  0.06  0.59  0.75  0.84
0.20     57.8  51.7  48.5  0.052  0.067  0.079  0.12  0.06  0.03  0.70  0.83  0.90
0.25     54.3  47.7  44.3  0.061  0.082  0.096  0.08  0.03  0.02  0.79  0.89  0.93

Table 4: Impact of payment offset and null loss fix for $\zeta = 0.5$ and greedy outcome rule,
training set size 300. All results are for $\chi_2$; null loss values appear in the second row.
            accuracy                    regret                      ir-violation
n  $\zeta$  $\chi_2$  $\chi_2$ (i-mon)  $\chi_2$  $\chi_2$ (i-mon)  $\chi_2$  $\chi_2$ (i-mon)
2  0.5      46.9      46.3              0.098     0.232             0.28      0.38
3  0.5      54.4      8.6               0.082     0.465             0.33      0.06
4  0.5      58.5      48.2              0.061     0.811             0.31      0.25
5  0.5      61.8      57.0              0.048     0.136             0.26      0.26
6  0.5      63.9      61.3              0.049     0.078             0.25      0.20
2  1.0      84.1      82.2              0.008     0.010             0.06      0.08
3  1.0      80.6      80.1              0.009     0.010             0.10      0.09
4  1.0      83.1      79.7              0.009     0.012             0.11      0.11
5  1.0      84.3      77.2              0.009     0.020             0.10      0.11
6  1.0      88.3      83.9              0.005     0.013             0.08      0.11
2  1.5      89.8      89.1              0.002     0.003             0.03      0.06
3  1.5      91.5      91.3              0.002     0.003             0.04      0.04
4  1.5      90.8      89.7              0.003     0.003             0.06      0.06
5  1.5      90.5      87.3              0.003     0.005             0.04      0.05
6  1.5      92.5      70.8              0.002     0.081             0.06      0.17

Table 5: Comparison of performance with and without optimistically assuming item monotonicity.
(i-mon) indicates a payment rule learned by optimistically assuming item monotonicity.
Greedy outcome rule. Training set size 300.

rightmost columns of Table [4] list the average ratio between welfare after and before the 
deallocation fix, across the instances in the test set. With a payment offset of 0, a large 
welfare hit is incurred if we deallocate agents with IR violations. However, this penalty 
decreases with increasing payment offsets and increasing null loss. At the most extreme 
payment offset and null loss adjustment, the IR violation is as low as 2%, and the deallo- 
cation fix incurs a welfare loss of only 7%. 
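To make the reported quantities concrete, the following sketch computes an IR violation rate and the welfare ratio of the deallocation fix on a single toy instance. It is an illustration under stated assumptions rather than the authors' evaluation code: the function names and the numbers are hypothetical, and flooring the shifted payment at zero is an assumption about how the offset is applied.

```python
def apply_payment_offset(payments, offset):
    """Lower every payment by `offset`; the floor at zero is an assumption."""
    return [max(0.0, p - offset) for p in payments]

def ir_violation_rate(values, payments):
    """Fraction of agents charged more than their value for the assigned outcome."""
    return sum(p > v for v, p in zip(values, payments)) / len(values)

def deallocation_welfare_ratio(values, payments):
    """Welfare after deallocating IR-violating agents, relative to welfare before."""
    before = sum(values)
    after = sum(v for v, p in zip(values, payments) if p <= v)
    return after / before if before > 0 else 1.0

# Hypothetical values of four agents for their assigned bundles, and learned payments.
values = [0.8, 0.5, 0.3, 0.6]
payments = [0.7, 0.55, 0.45, 0.2]

for offset in (0.0, 0.1, 0.2):
    shifted = apply_payment_offset(payments, offset)
    print(offset,
          ir_violation_rate(values, shifted),
          round(deallocation_welfare_ratio(values, shifted), 3))
```

In this toy instance, larger offsets reduce the violation rate and so shrink the welfare lost to deallocation, mirroring the trend in the table. The ir-fix-welfare-avg column would correspond to averaging the ratio over all test instances.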

Figure [2] shows a graphical representation of the impact of payment offsets and null 
losses. Each line in the plot corresponds to a payment rule learned with a different null 
loss, and each point on a line corresponds to a different payment offset. The payment 
offset is zero for the top-most point on each line, and equal to 0.29 for the lowest point 
on each line. Increasing the payment offset always decreases the rate of IR violation, but 
may decrease or increase regret. Increasing null loss lowers the top-most point on a given
line, but arbitrarily increasing null loss can be harmful. Indeed, in the figure on the left, a
null loss of 1.5 results in a slightly higher top-most point but significantly lower regret at
this top-most point compared to a null loss of 2.0. It is also interesting to note that these
adjustments have much more impact on the hardest distribution with $\zeta = 0.5$.
   accuracy                       regret                         ir-violation
n  vcg    tot-vcg  eg-vcg  $p_w$  vcg    tot-vcg  eg-vcg  $p_w$  vcg    tot-vcg  eg-vcg  $p_w$
2  64.3   67.5     6T5     89.0   0.018  0.015    0.015   0.023  0.03   Ml       0.01    0.03
3  48.0   52.1     42.5    77.9   0.070  0.077    0.127   0.041  0.06   0.07     0.03    0.04
4  40.6   43.1     30.8    71.0   0.111  0.123    0.199   0.054  0.07   0.09     0.03    0.02
5  32.4   35.3     24.5    63.9   0.157  0.169    0.254   0.071  0.10   0.12     0.03    0.01
6  27.1   29.9     20.0    59.0   0.189  0.208    0.290   0.074  0.10   0.13     0.03    0.01

Table 6: Results for assignment problem with egalitarian outcome rule



6.3.7 Item Monotonicity 

Table [5] presents a comparison of a payment rule learned with explicit enumeration of all
bundle constraints (the default that we have been using for our other results) and a payment
rule learned by optimistically assuming item monotonicity (see Section 5.1.3). Performance
is affected when we drop constraints and optimistically assume item monotonicity, although
the effects are small for $\zeta \in \{1.0, 1.5\}$ and larger for $\zeta = 0.5$. Because item monotonicity
allows for the training problem to be succinctly specified, we may be able to train on more
data, and this seems a very promising avenue for further consideration (perhaps coupled
with heuristic methods to add additional constraints to the training problem).



6.4 The Assignment Problem 

In the assignment problem, agents' values for the items are sampled uniformly and independently
from [0,1]. We use a training set of size 600, validation and test sets of size
1000, and the RBF kernel with parameters $C \in \{10, 1000, 100000\}$ and $\gamma \in \{0.1, 0.5, 1.0\}$.
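For readers unfamiliar with the kernel, the sketch below writes out the RBF kernel in its standard form, $k(x, y) = \exp(-\gamma \lVert x - y \rVert^2)$, and enumerates the parameter combinations listed above. The names `rbf_kernel`, `C_VALUES`, `GAMMA_VALUES`, and `parameter_grid` are hypothetical. That the experiments use exactly this kernel form and search the full grid is an assumption.

```python
import math
from itertools import product

def rbf_kernel(x, y, gamma):
    """Standard RBF kernel exp(-gamma * ||x - y||^2) on two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Parameter values taken from the text; searching every combination is an assumption.
C_VALUES = [10, 1000, 100000]
GAMMA_VALUES = [0.1, 0.5, 1.0]
parameter_grid = list(product(C_VALUES, GAMMA_VALUES))
```

Larger values of $\gamma$ make the kernel more local, while larger values of $C$ penalize training errors more heavily. A validation set, such as the one mentioned above, is the usual means of choosing among the nine combinations.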

The performance of the learned payment rules is compared to that of three VCG-based
payment rules. Let $W$ be the total welfare of all agents other than $i$ under the outcome
chosen by $g$, and $W_{eg}$ be the minimum value any agent other than $i$ receives under this
outcome. We then consider the following payment rules: (1) the vcg payment rule, where
agent $i$ pays the difference between the maximum total welfare of the other agents under
any allocation and $W$; (2) the tot-vcg payment rule, where agent $i$ pays the difference
between the total welfare of the other agents under the allocation maximizing egalitarian
welfare and $W$; and (3) the eg-vcg payment rule, where agent $i$ pays the difference between
the minimum value of any agent under the allocation maximizing egalitarian welfare and
$W_{eg}$.
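The three benchmark rules can be illustrated by brute force on a small instance. The sketch below is one possible reading of the definitions, not the authors' implementation. It assumes that the egalitarian rule maximizes the minimum value with ties broken by total value, and that "the allocation maximizing egalitarian welfare" in rules (2) and (3) is recomputed for the agents other than $i$. All function names and the example values are hypothetical.

```python
from itertools import permutations

def allocations(agents, items):
    """Yield every one-to-one assignment of distinct items to the given agents."""
    for chosen in permutations(items, len(agents)):
        yield dict(zip(agents, chosen))

def total_value(values, alloc):
    return sum(values[a][alloc[a]] for a in alloc)

def min_value(values, alloc):
    return min(values[a][alloc[a]] for a in alloc)

def egalitarian(values, agents, items):
    """Maximize the minimum value; ties broken by total value (an assumption)."""
    return max(allocations(agents, items),
               key=lambda al: (min_value(values, al), total_value(values, al)))

def vcg_based_payments(values, i):
    """Payments of agent i under the vcg, tot-vcg and eg-vcg rules."""
    agents = list(range(len(values)))
    items = list(range(len(values[0])))
    others = [a for a in agents if a != i]
    chosen = egalitarian(values, agents, items)      # outcome selected by g
    W = sum(values[a][chosen[a]] for a in others)
    W_eg = min(values[a][chosen[a]] for a in others)
    best_total = max(total_value(values, al) for al in allocations(others, items))
    eg_others = egalitarian(values, others, items)   # assumption: recomputed without i
    return {
        "vcg": best_total - W,
        "tot-vcg": total_value(values, eg_others) - W,
        "eg-vcg": min_value(values, eg_others) - W_eg,
    }

# Hypothetical instance: values[a][k] is agent a's value for item k.
values = [[0.9, 0.5, 0.1],
          [0.8, 0.4, 0.3],
          [0.7, 0.6, 0.2]]
payments_agent_0 = vcg_based_payments(values, 0)
```

Enumerating permutations is only feasible for very small instances, but it keeps the correspondence with the verbal definitions of $W$ and $W_{eg}$ easy to check.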

The results for attribute map $\chi_3$ are shown in Table [6]. We see that the learned payment
rule $p_w$ yields significantly lower regret than any of the VCG-based payment rules, and
average ex post regret less than 0.074 for values normalized to [0,1]. Since we are not
maximizing the sum of values of the agents, it is not very surprising that VCG-based
payment rules perform rather poorly. The learned payment rule $p_w$ can adjust to the
outcome rule, and also achieves a low fraction of ex post IR violation of at most 3%.

7 Conclusions 

We have introduced a new paradigm for computational mechanism design in which statisti- 
cal machine learning is adopted to design payment rules for given algorithmically specified 
outcome rules, and have shown encouraging experimental results. Future directions of in- 
terest include (1) an alternative formulation of the problem as a regression rather than 
classification problem, (2) constraints on properties of the learned payment rule, concern- 
ing for example the core or budgets, (3) methods that learn classifiers more likely to induce 
feasible outcome rules, so that these learned outcome rules can be used, (4) optimistically 
assuming item monotonicity and dropping constraints implied by it, thereby allowing for 
better scaling of training time with training set size at the expense of optimizing against 
a subset of the full constraints in the training problem, and (5) an investigation of the ex- 
tent to which alternative goals such as regret percentiles or interim regret can be achieved 
through machine learning. 
A Efficient Computation of Inner Products

For both $\chi_1$ and $\chi_2$, computing inner products reduces to the question of whether inner
products between valuation profiles are efficiently computable. For $\chi_1$, we have that

$$\langle \chi_1(\theta_{-1}, o_1), \chi_1(\theta'_{-1}, o'_1) \rangle = \mathbb{I}_{o_1 = o'_1} \sum_{i=2}^{n} \langle \theta_i, \theta'_i \rangle,$$

where indicator $\mathbb{I}_{o_1 = o'_1} = 1$ if $o_1 = o'_1$ and $\mathbb{I}_{o_1 = o'_1} = 0$ otherwise. For $\chi_2$,

$$\langle \chi_2(\theta_{-1}, o_1), \chi_2(\theta'_{-1}, o'_1) \rangle = \sum_{i=2}^{n} \langle \theta_i \setminus o_1, \theta'_i \setminus o'_1 \rangle.$$

We next develop efficient methods for computing the inner products $\langle \theta_i, \theta'_i \rangle$ on
compactly represented valuation functions. The computation of $\langle \theta_i \setminus o_1, \theta'_i \setminus o'_1 \rangle$ can be done
through similar methods.

In the single-minded setting, let $\theta_i$ correspond to a bundle $S_i \subseteq \{1, \ldots, r\}$ of items with
value $v_i$, and $\theta'_i$ correspond to a set $S'_i \subseteq \{1, \ldots, r\}$ of items valued at $v'_i$.

Each set containing both $S_i$ and $S'_i$ contributes $v_i v'_i$ to $\theta_i^T \theta'_i$, while all other sets
contribute 0. Since there are exactly $2^{r - |S_i \cup S'_i|}$ sets containing both $S_i$ and $S'_i$, we have

$$\theta_i^T \theta'_i = v_i v'_i \, 2^{r - |S_i \cup S'_i|}.$$

This is a special case of the formula for the multi-minded case.
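The closed form for the single-minded case can be checked against direct enumeration of all $2^r$ bundles. The sketch below is illustrative only. The function names are hypothetical, and it assumes that a single-minded valuation assigns value $v_i$ to every bundle containing $S_i$ and 0 to all other bundles.

```python
from itertools import combinations

def all_bundles(r):
    """Yield every subset of the items {1, ..., r} as a frozenset."""
    items = range(1, r + 1)
    for size in range(r + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def single_minded_value(bundle, desired, value):
    """Value of `bundle` for an agent who wants `desired` at `value`."""
    return value if desired <= bundle else 0.0

def inner_product_brute_force(r, S, v, S_prime, v_prime):
    """Sum the product of the two valuations over all 2^r bundles."""
    return sum(single_minded_value(B, S, v) * single_minded_value(B, S_prime, v_prime)
               for B in all_bundles(r))

def inner_product_closed_form(r, S, v, S_prime, v_prime):
    """v * v' * 2^(r - |S union S'|), avoiding the exponential enumeration."""
    return v * v_prime * 2 ** (r - len(S | S_prime))

S, S_prime = frozenset({1, 2}), frozenset({2, 3})
brute = inner_product_brute_force(4, S, 3.0, S_prime, 5.0)
closed = inner_product_closed_form(4, S, 3.0, S_prime, 5.0)
```

With $r = 4$, only the bundles $\{1,2,3\}$ and $\{1,2,3,4\}$ contain both desired sets, so both computations give $2 \cdot 3 \cdot 5 = 30$.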
Lemma 2. Consider a multi-minded CA and two bid vectors $x_1$ and $x'_1$ corresponding to
sets $S = \{S_1, \ldots, S_s\}$ and $S' = \{S'_1, \ldots, S'_t\}$, with associated values $v_1, \ldots, v_s$ and $v'_1, \ldots, v'_t$.
Then,

$$x_1^T x'_1 = \sum_{T \subseteq S,\, T' \subseteq S'} \Big( (-1)^{|T| + |T'|} \cdot \big( \min_{S_i \in T} v_i \big) \cdot \big( \min_{S'_j \in T'} v'_j \big) \cdot 2^{\, r - |(\bigcup_{S_i \in T} S_i) \cup (\bigcup_{S'_j \in T'} S'_j)|} \Big), \tag{4}$$

where the sum ranges over nonempty subsets $T$ and $T'$.

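Proof. The contribution of a particular bundle $B$ of items to the inner product is
$(\max_{S_i \in S,\, S_i \subseteq B} v_i) \cdot (\max_{S'_j \in S',\, S'_j \subseteq B} v'_j)$, and thus

$$x_1^T x'_1 = \sum_{B} \Big( \big( \max_{S_i \in S,\, S_i \subseteq B} v_i \big) \cdot \big( \max_{S'_j \in S',\, S'_j \subseteq B} v'_j \big) \Big).$$

By the maximum-minimums identity, which asserts that for any set $X = \{x_1, \ldots, x_n\}$ of $n$
numbers, $\max\{x_1, \ldots, x_n\} = \sum_{Z \subseteq X} ((-1)^{|Z|+1} \cdot (\min_{x_i \in Z} x_i))$ with the sum taken over nonempty $Z$,

$$\max_{S_i \in S,\, S_i \subseteq B} v_i = \sum_{\substack{T \subseteq S \\ \bigcup_{S_i \in T} S_i \subseteq B}} \Big( (-1)^{|T|+1} \cdot \big( \min_{S_i \in T} v_i \big) \Big) \quad \text{and}$$

$$\max_{S'_j \in S',\, S'_j \subseteq B} v'_j = \sum_{\substack{T' \subseteq S' \\ \bigcup_{S'_j \in T'} S'_j \subseteq B}} \Big( (-1)^{|T'|+1} \cdot \big( \min_{S'_j \in T'} v'_j \big) \Big).$$

The inner product can thus be written as

$$x_1^T x'_1 = \sum_{B} \; \sum_{\substack{T \subseteq S,\, T' \subseteq S' \\ \bigcup_{S_i \in T} S_i \subseteq B \\ \bigcup_{S'_j \in T'} S'_j \subseteq B}} \Big( (-1)^{|T|+|T'|} \cdot \big( \min_{S_i \in T} v_i \big) \cdot \big( \min_{S'_j \in T'} v'_j \big) \Big).$$

Finally, for given $T \subseteq S$ and $T' \subseteq S'$, there exist exactly $2^{\, r - |(\bigcup_{S_i \in T} S_i) \cup (\bigcup_{S'_j \in T'} S'_j)|}$ bundles
$B$ such that $\bigcup_{S_i \in T} S_i \subseteq B$ and $\bigcup_{S'_j \in T'} S'_j \subseteq B$, and we obtain

$$x_1^T x'_1 = \sum_{T \subseteq S,\, T' \subseteq S'} \Big( (-1)^{|T| + |T'|} \cdot \big( \min_{S_i \in T} v_i \big) \cdot \big( \min_{S'_j \in T'} v'_j \big) \cdot 2^{\, r - |(\bigcup_{S_i \in T} S_i) \cup (\bigcup_{S'_j \in T'} S'_j)|} \Big).$$

□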

If $S$ and $S'$ have constant size, then the sum on the right hand side of (4) ranges over
a constant number of sets and can be computed efficiently.
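The formula of Lemma 2 can be compared with brute-force enumeration on a small instance. The sketch below is an illustration, not the authors' code. Function names are hypothetical, bids are represented as (bundle, value) pairs with nonnegative values, a bundle's value is taken to be the largest bid contained in it (0 if none), and the sums run over nonempty subsets $T$ and $T'$ as the maximum-minimums identity requires.

```python
from itertools import combinations

def nonempty_subsets(bids):
    """Yield every nonempty subset of a list of (bundle, value) bids."""
    for size in range(1, len(bids) + 1):
        yield from combinations(bids, size)

def multi_minded_value(bundle, bids):
    """Largest value among bids whose bundle is contained in `bundle`, else 0."""
    return max((v for S, v in bids if S <= bundle), default=0.0)

def inner_product_brute_force(r, bids, bids_prime):
    """Sum the product of both valuations over all 2^r bundles."""
    total = 0.0
    for size in range(r + 1):
        for combo in combinations(range(1, r + 1), size):
            B = frozenset(combo)
            total += multi_minded_value(B, bids) * multi_minded_value(B, bids_prime)
    return total

def inner_product_lemma2(r, bids, bids_prime):
    """Inclusion-exclusion sum over pairs of nonempty bid subsets (T, T')."""
    total = 0.0
    for T in nonempty_subsets(bids):
        for T_prime in nonempty_subsets(bids_prime):
            sign = (-1) ** (len(T) + len(T_prime))
            union = frozenset().union(*(S for S, _ in T), *(S for S, _ in T_prime))
            total += (sign * min(v for _, v in T) * min(v for _, v in T_prime)
                      * 2 ** (r - len(union)))
    return total

bids = [(frozenset({1}), 2.0), (frozenset({1, 2}), 5.0)]
bids_prime = [(frozenset({2}), 3.0), (frozenset({2, 3}), 4.0)]
brute = inner_product_brute_force(3, bids, bids_prime)
lemma = inner_product_lemma2(3, bids, bids_prime)
```

The brute-force version is exponential in the number of items $r$, whereas the inclusion-exclusion version is exponential only in the number of bids, which is why constant-size bid sets make the computation efficient.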
B Greedy Allocation Rule is not Weakly Monotone

Consider a setting with a single agent and four items.
If the valuations $\theta_1$ of the agent are

$$v_1(\theta_1, o_1) = \begin{cases} 20 & \text{if } o_1 = \{1, 2, 3, 4\}, \\ 12 & \text{if } 1 \in o_1 \text{ and } j \notin o_1 \text{ for some } j \in \{2, 3, 4\}, \text{ and} \\ 0 & \text{else,} \end{cases}$$

then the allocation is $\{1\}$.

If the valuations are $\theta'_1$ such that

$$v_1(\theta'_1, o_1) = \begin{cases} 12 & \text{if } o_1 = \{1, 2, 3, 4\}, \\ 5 & \text{if } 1 \in o_1 \text{ and } j \notin o_1 \text{ for some } j \in \{2, 3, 4\}, \text{ and} \\ 0 & \text{else,} \end{cases}$$

then the allocation is $\{1, 2, 3, 4\}$.

We have $v_1(\theta'_1, \{1, 2, 3, 4\}) - v_1(\theta'_1, \{1\}) < v_1(\theta_1, \{1, 2, 3, 4\}) - v_1(\theta_1, \{1\})$, contradicting
weak monotonicity.
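The counterexample can be replayed numerically. The sketch below assumes that the greedy rule ranks bids by value divided by the square root of the bundle size, a common choice that reproduces the two allocations stated above. This scoring rule is an assumption about the outcome rule, and the function names are hypothetical. The valuations are encoded by their minimal bundles, $\{1\}$ and $\{1, 2, 3, 4\}$.

```python
import math

def greedy_single_agent(bids):
    """Return the bundle of the bid with the largest value / sqrt(size) score."""
    return max(bids, key=lambda bid: bid[1] / math.sqrt(len(bid[0])))[0]

def value(bids, outcome):
    """Largest value among bids whose bundle is contained in `outcome`, else 0."""
    return max((v for S, v in bids if S <= outcome), default=0)

ONE = frozenset({1})
FULL = frozenset({1, 2, 3, 4})
theta = [(ONE, 12), (FULL, 20)]
theta_prime = [(ONE, 5), (FULL, 12)]

a = greedy_single_agent(theta)         # allocation under theta
b = greedy_single_agent(theta_prime)   # allocation under theta'

# Weak monotonicity requires v(theta', b) - v(theta', a) >= v(theta, b) - v(theta, a).
lhs = value(theta_prime, b) - value(theta_prime, a)
rhs = value(theta, b) - value(theta, a)
weakly_monotone = lhs >= rhs
```

Under this scoring, the scores are 12 versus 10 for $\theta_1$ and 5 versus 6 for $\theta'_1$, so the allocation switches from $\{1\}$ to $\{1, 2, 3, 4\}$ while the value differences are 7 and 8, and the check fails.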






